INDETERMINISM IN SCIENCE AND NEW DEMANDS 
ON STATISTICIANS¹ 


Jerzy NEYMAN 
University of California, Berkeley 


The words “indeterministic study” are used to designate research 
aiming to determine how frequently a quantity X characterizing the 
phenomena considered assumes its various particular values. If the pur- 
pose of research is to establish the exact value of X as a function of 
other variables, then this research is “deterministic.” In the history of 
indeterminism in science four (overlapping) periods are discernible. 

(i) Period of “marginal indeterminism.” This was the period, symbol- 
ized by the names of Laplace and Gauss, in which research in science 
was all deterministic with just one domain, that of errors of measure- 
ment, treated indeterministically. 

(ii) Period of “static indeterminism,” roughly covering the end of 
the nineteenth and the beginning of the twentieth centuries, is symbol- 
ized by names of Bruns, Charlier, Edgeworth, Galton and Karl Pearson. 
Here, the main subject of study was a “population” and efforts were 
made to develop systems of frequency curves to describe analytically 
the empirical distributions. 

(iii) The third discernible period, roughly from 1920 to 1940, may 
be termed the period of “static indeterministic experimentation.” It is 
marked by the name of R. A. Fisher and by his book The design of ex- 
periments. The typical problems considered were: do these two popula- 
tions have the same distributions of X? This and similar questions led 
to the development of basic ideas of tests of statistical hypotheses and 
of estimation, and also of the appropriate techniques. All of these are 
currently at the disposal and in constant use of an applied statistician. 

(iv) The fourth period in the history of indeterminism, currently in 
full swing, the period of “dynamic indeterminism,” is characterized by 
the search for evolutionary chance mechanisms capable of explaining 
the various frequencies observed in the development of the phenomena 
studied. The chance mechanism of carcinogenesis and the chance 
mechanism behind the varying properties of the comets in the Solar 
System exemplify the subjects of dynamic indeterministic studies. One 
might hazard the assertion that every serious contemporary study is a 
study of the chance mechanism behind some phenomena. The statistical 
and probabilistic tool in such studies is the theory of stochastic proces- 
ses, now involving many unsolved problems. In order that the applied 
statistician be in a position to cooperate effectively with the modern 
experimental scientist, the theoretical equipment of the statistician 
must include familiarity and capability of dealing with stochastic 
processes. 


1 Invited address delivered at the Annual Meeting of the American Statistical Association in Stanford, August 
24, 1960. Prepared with partial support of the Office of Ordnance Research. 

1. INTRODUCTION 


Up to recently, the stock in trade of a statistician engaged in scientific research
in some applied field was limited to the prewar theories of statistics 
and of experimental design. Both may perhaps be described as static theories. 
During the fifteen years elapsed since the conclusion of World War II the trends 
in science have changed remarkably and, if we statisticians want to stay in 
business, we must learn to use more modern tools, and, indeed, must embark 
on fresh theoretical work and develop entirely novel techniques to fit the cur- 
rent trends in science. This work in various new directions is already in progress 
in several statistical centers, but it is my impression that its importance is not 
fully appreciated by the statistical brotherhood at large. Also, it is my im- 
pression that the ties between research in science and modern theoretical 
studies in statistics are not generally understood. 

The purpose of the present paper is to characterize briefly the trends in 
modern science, labeled dynamic indeterminism, that demand fresh develop- 
ments in statistics and to indicate a few examples of important studies, largely 
incomplete, in which success depends not only on the ingenuity of the experi- 
menter or observer but also on the availability of statistical techniques very 
different from those of the prewar era and mostly yet to be developed. 


2. MEANING OF “INDETERMINISM” 


In current literature the word indeterminism is being used in two different 
contexts. First, we occasionally speak of certain phenomena as being deter- 
ministic, and contrast them with some others termed indeterministic. The true 
nature of phenomena is a question of metaphysics, is debatable and, in order 
to stay on firmer ground, I like to attach the description “deterministic” or 
“indeterministic” not to the phenomena themselves but to our approach to 
these phenomena. 

If, in contemplating a system of phenomena characterized by certain quantities
$X_1, X_2, \ldots, X_n, Y$, I think of these quantities as subjected to some functional
relations, then my approach is deterministic. In this case, I visualize the 
existence of a known or unknown formula from which the value of one variable, 
say Y, can be exactly calculated from the given values of the other variables 
$X_1, X_2, \ldots, X_n$. Usually, the subject of study is the establishment of such a 
formula. 

The alternative way of approaching the same phenomena is by visualizing 
that among the variables characterizing them there is at least one, say the 
same variable Y, whose value is not calculable from those of the other variables 
but is determined by a chance mechanism, whose functioning depends upon 
$X_1, X_2, \ldots, X_n$. In other words, in this alternative way of approaching the 
phenomena, the variable Y is treated as a random variable with its distribution 
depending upon the values assumed by other variables. In studying the phe- 
nomena, instead of trying to compute the value of Y from the given values of 
$X_1, X_2, \ldots, X_n$, we are trying to determine how frequently Y has the value 
unity, how frequently it is equal to two, or how frequently it is between one 
and two, etc. 

As is well known, the first chance mechanisms to be considered were those 
connected with games of chance, coin tossing, etc. These and similar chance 
mechanisms are also contemplated in current scientific research. However, the 
role of chance mechanisms in science varied in time and four distinct periods 
are discernible. 


3. PERIOD OF MARGINAL INDETERMINISM IN SCIENCE 


Although certain astronomical problems were treated indeterministically 
almost two centuries ago, notably by Laplace, such appearances of indeter- 
minism in astronomy and in other sciences were sporadic and the first domain 
subjected to a thorough indeterministic treatment was the broad domain of 
errors of measurement. Here we owe a tremendous amount to Laplace and to 
Gauss. The indeterministic studies of errors were attached to purely deter- 
ministic studies of the various astronomical and other problems. Observations 
were made on positions of planets and comets. If the positions of a celestial body 
are given for several moments of time, $t_1, t_2, \ldots, t_m$, formulas were sought to 
determine the position at any subsequent time $t$. Unfortunately, as a rule, the 
position of the body at time $t$ calculated from observations at times 
$t_1, t_2, \ldots, t_m$ differs from that calculated on the basis of observations made at 
other times, $t_{m+1}, t_{m+2}, \ldots, t_{2m}$. This and other similar circumstances brought 
under consideration the concept of the random error of measurement and 
created the problem of statistical point estimation. As is well known, this in 
turn resulted in the emergence of the idea of the loss function, of the unbiased 
estimate, of the minimum variance unbiased estimate and in the creation of 
the theory of least squares. However, while rich in ideas and results, this early 
period is characterized by the marginal role of the indeterministic point of view. 
The approach to the main subjects of scientific research, such as the motion of 
a given planet, was deterministic. Indeterminism applied only to a subdomain, 
to the realm of errors of measurement. 
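
The situation just described may be illustrated by a small numerical sketch. It does not reproduce any historical computation: the straight-line "motion," the size of the errors, the observation times, and the names fit_line and predict are all assumptions made for illustration. Two disjoint sets of observations of the same body lead to different predicted positions at a later time, while a least squares fit to all the observations pools them into a single estimate.

```python
import random

def fit_line(times, positions):
    """Ordinary least squares fit of position = a + b * time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_p = sum(positions) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (p - mean_p) for t, p in zip(times, positions))
    b = sxy / sxx
    a = mean_p - b * mean_t
    return a, b

def predict(coeffs, t):
    """Position at time t implied by the fitted coefficients (a, b)."""
    a, b = coeffs
    return a + b * t

# Hypothetical "true" motion and error of measurement (illustrative values only).
rng = random.Random(0)
true_a, true_b, error_sd = 10.0, 2.0, 0.5
times = list(range(1, 11))
observed = [true_a + true_b * t + rng.gauss(0.0, error_sd) for t in times]

# Fits based on the first five, the last five, and all ten observations.
first_half = fit_line(times[:5], observed[:5])
second_half = fit_line(times[5:], observed[5:])
pooled = fit_line(times, observed)

t_future = 20
for label, coeffs in [("first five", first_half),
                      ("last five", second_half),
                      ("all ten", pooled)]:
    print(label, round(predict(coeffs, t_future), 3))
```

The two partial fits disagree only because of the simulated errors of measurement; the deterministic law of motion itself is the same throughout, which is the sense in which the indeterminism here is marginal.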


4. STATIC INDETERMINISM OF THE LATE NINETEENTH CENTURY 


The first studies marked by an indeterministic approach to their main sub- 
ject appeared in the late nineteenth and in early twentieth centuries, in the 
writings of some astronomers (Bruns, Charlier), of the early biometricians 
(Galton, Karl Pearson) and of Edgeworth. The works of this period are char- 
acterized by the realization that the subject of study is a multitude now called 
“population,” of certain objects, rather than the individual objects of this 
multitude. Furthermore, it was apparent that while all the members of a given 
population satisfy a common definition (for example, they may be stars of a 
given category, or individual plants of a given variety of wheat, etc.), they differ 
considerably in their measurable properties. The underlying idea was that a 
chance mechanism must exist, perhaps subject to influences of various external 
factors, which determines that a given leaf of a specified tree is three inches 
long, while the length of the next leaf on the same twig is only two inches and 
a half, etc. 

Since the main subjects of study were populations and since member-to-member 
variability is an outstanding characteristic of a population, the problem 
arose of inventing mathematical means of describing this variability. Thus, 
the period under discussion was the period of search for convenient formulas, 
and of methods of fitting them, to represent empirical frequency distributions. 
As is well known, there resulted several systems of theoretical frequency curves, 
of which the most successful seems to be that of Karl Pearson. 

As far as studies in science proper are concerned, the success of the in- 
deterministic approach at this period seems to have been rather modest. When 
trying to remember an outstanding success in science (for example, in biology, 
physics, etc.) achieved in the late nineteenth or early twentieth century, as a 
result of the adoption of the indeterministic approach, I can think only of the 
“Law of Ancestral Inheritance,” discovered by Galton and Karl Pearson. This 
is represented by correlation coefficients between the characteristics of relatives 
like father and son, like two brothers (with coefficients about equal to one-half), 
etc. In other domains, the hopes attached to the use of frequency curves and, 
later, surfaces, were disappointed. Furthermore, it seems appropriate to men- 
tion that the conceptual side of mathematical statistics of this period was ex- 
tremely misty, to say the least. 


5. STATIC INDETERMINISTIC EXPERIMENTATION 


The third discernible period of indeterminism in science extends roughly 
from 1920 to 1940 and is marked by the appearance on the scene of R. A. 
Fisher. Conceptually, this period may be termed the period of static indeterministic 
experimentation. Beginning with problems of agriculture, it was realized 
that there is no such thing as a unique “true” yield of a given variety of 
wheat per unit area. On the other hand, a kind of reality may be claimed for 
the existence of a population of yields that this variety may give if grown on a 
population of such unit plots. The distribution in a population of this kind may 
be, and has been, studied in so-called uniformity trials. The samples obtained 
indicated unimodality and, if one feels indulgent, an approach to normality. 

The problem of experimentation with several treatments reduces to the de- 
velopment of techniques, involving random sampling, whereby one can decide 
whether the population of yields under treatment $T_1$ has a higher mean than 
another population of yields of the same wheat grown under treatment $T_2$. As 
is well known, Fisher’s work on this subject, as reflected in his memorable book, 
“The Design of Experiments,” [1], broke new ground in many respects and 
had, and still has, enormous influence on experimentation not only in agricul- 
ture, but also in innumerable other domains of research in science. Here, then, 
the adoption of indeterministic approach, involving the realization that the 
experiment is concerned with populations of which random samples may be 
available, played an outstanding role in obtaining important scientific results. 

The same twenty-year period just before World War II appears to have been 
very fruitful in developing and sorting out the basic concepts of statistical 
theory. The concept of, and the term, “statistical hypothesis” was born then. 
The same applies to concepts of tests of statistical hypotheses, of errors of first 
and second kind, of power of tests, of interval estimation, etc. Also, a great 
number of techniques referring to these concepts, many of them due to R. A. 
Fisher, were developed at that time. They are precisely the ones that, combined 
with the techniques of designing experiments, also due to Fisher and his school, 
and supplemented by a few more recent findings, now represent the conventional 
contents of the tool box of a research statistician cooperating with the experi- 
mental scientist. 

While these techniques are durable and are likely to stay with us more or 
less indefinitely, perhaps with minor modifications, it is essential to be clear 
about their limitations: they were designed for “static” research. By this term I 
mean a study reducible to comparisons of two or more populations as they exist 
at a given moment, or moments, to the exclusion of explicit consideration of 
the process of evolution that may be going on. More specifically, the statistical 
methodology of the prewar era was not designed for the study of chance 
mechanisms behind the phenomena developing in time and space. Yet, as I 
will try to illustrate, the predominant trend in modern science is precisely 
towards the understanding of the evolutionary processes subject to chance. It 
is in connection with such studies that I use the term indeterministic dynamics 
and its various modifications. 


6. INDETERMINISM IN DYNAMIC STUDIES 


When sketching a history of scientific thought and trying to indicate periods 
of birth and development of particular ideas, one is invariably confronted with 
the difficulty of overlaps. It frequently happens that, while several successive 
decades are dominated by a particular idea A, another idea B is born quite 
early in the same period. It is there all the time but is dormant, only to emerge 
in full bloom some years later. The best one can do is to assign the idea B a 
period beginning with the time it appears to dominate the thinking of leading 
scientists. From this point of view, the period of indeterminism in dynamic 
studies began quite recently; it is upon us now, but it is good to be aware that 
the first attempts at indeterministic dynamic approach to phenomena (Mendel- 
ism, statistical mechanics) were made in the middle of the last century, some 
one hundred years ago. 

The essence of dynamic indeterminism in science consists in an effort to in- 
vent a hypothetical chance mechanism, called a “stochastic model,” operating 
on various clearly defined hypothetical entities, such that the resulting frequencies 
of the various possible outcomes correspond approximately to those 
actually observed. 

The scientific value of a given model depends on the degree to which it satis- 
fies two criteria which I shall label (i) the criterion of broad applicability and 
(ii) the criterion of identifiability of details. 

The term “broad applicability” is used to describe the possibility of deducing 
from the model verifiable consequences relating to categories of observation 
other than those for which the model was originally constructed. The term 
“identifiability of details” refers to the possibility of identifying in the empirical 
universe the items that correspond to the various hypothetical entities involved 
in the model. In order to illustrate these definitions I shall refer to some familiar 
examples. 

Turning to the Galton-Pearson Law of Ancestral Heredity, we notice that 
the fitting of a regression line to the observations of fathers and sons, and the 
discovery that the correlation coefficient equals one-half, does not satisfy the 
definition of the dynamical indeterministic approach. In the process of fitting 
a line there is no stochastic model involved and, if a term is needed, then the 
term “interpolatory procedure” would seem appropriate. The same remark ap- 
plies to the use of randomized block or Latin square designs in an agricultural 
trial. It is true that some authors write the familiar formula 


$$x_{ij} = \mu + R_i + C_j + T_k + \varepsilon_{ij} \qquad (1)$$


where $R_i$ and $C_j$ denote the “row effect” and the “column effect,” respectively, 
and describe it as a “model.” This use of the word is different from mine. The 
reason is that the subsequent analysis, particularly clear in the case of random- 
ized blocks, implies the recognition that the fertility of single plots within a 
block is not the same but varies. Also, it is admitted that this fertility may well 
vary in a systematic way for, otherwise, there would be no need to randomize 
the blocks. Thus, the row and column components in a Latin square design 
combine to produce an approximation to the admittedly irregular fertility level 
of the experimental field, and the role of the layout (1) is quite similar to that 
of linear or quadratic or harmonic interpolation we are so accustomed to use 
with the various numerical tables. 

The first really significant stochastic model of a broad class of phenomena 
seems to have been provided by Mendel to explain, or to represent, heredity. 
In its simplest form, the Mendel Law asserts that a simple hereditary trait de- 
pends upon hypothetical entities, called genes, of which every organism has 
two for each trait. These genes may be identical, say AA, or aa, in which case 
the individual concerned is called homozygous. However, an individual organ- 
ism may have in a given locus two different genes Aa. It is then called hetero- 
zygous or hybrid. Whatever the case may be, the progeny inherits from each 
parent one of the available genes, with probability one-half for each. 
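
The chance mechanism just described is simple enough to be imitated directly. The following is a minimal sketch, in which the function names, the number of simulated offspring and the choice of a cross between two hybrids are illustrative assumptions and not part of the original account. Each parent passes on one of its two genes with probability one-half, and the observed frequencies of the three genotypes are tabulated.

```python
import random

def offspring_genotype(parent1, parent2, rng):
    """Each parent contributes one of its two genes, chosen with probability one-half."""
    genes = rng.choice(parent1) + rng.choice(parent2)
    # Sort so that "aA" and "Aa" are counted as the same genotype.
    return "".join(sorted(genes))

def cross_frequencies(parent1, parent2, n_offspring, seed=0):
    """Relative frequencies of offspring genotypes in n_offspring simulated births."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(n_offspring):
        genotype = offspring_genotype(parent1, parent2, rng)
        counts[genotype] = counts.get(genotype, 0) + 1
    return {g: c / n_offspring for g, c in counts.items()}

# Cross of two heterozygous (hybrid) individuals.
freqs = cross_frequencies("Aa", "Aa", 100_000)
for genotype in ("AA", "Aa", "aa"):
    print(genotype, round(freqs[genotype], 3))
```

Under the model the frequencies of AA, Aa and aa should be close to 1/4, 1/2 and 1/4, which is the kind of frequency statement, rather than an exact prediction for any one offspring, that characterizes the indeterministic approach.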

At the time of its invention, the Mendel model of heredity was little more 
than an interpolatory procedure invented to summarize Mendel’s experiments 
with peas. However, in due course, it was found that the stochastic mechanism 
invented satisfies both criteria (i) and (ii) to a remarkable degree. For example, 
not only did it appear that the inheritance of traits other than color of flowers, 
and for organisms other than peas, showed a similar segregation, but also 
simple calculations showed that, if the stature in man depends on a number of 
pairs of genes each inherited in accordance with the simple Mendel Law, then, 
in the absence of dominance and with random mating, the correlation between 
father and son should be one-half. Similar agreement between the Mendel 
model and the Galton-Pearson empirical findings was established for other 
degrees of family relationship. It is this variety of aspects of heredity deducible 
from the original model (occasionally combined with extra assumptions like 
that of random mating, independence, etc.) that constitutes the satisfaction of 
the criterion of “broad applicability.” Naturally, this alone contributes con- 
siderably to the scientific interest of the model. However, the emergence of this 
model in its present glory begins with the moment it began to satisfy the second 
criterion, that of “identifiability of details.” 
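
The deduction of the father-son correlation can be checked by simulation. The sketch below is only an illustration under stated assumptions: stature is taken to be the plain sum of gene values over ten pairs of genes (no dominance), both parents are drawn independently from the same population (random mating), and the gene frequency, the number of families and all function names are hypothetical choices.

```python
import random

def random_individual(n_loci, p, rng):
    """Two genes per locus; each gene has value 1 with probability p, else 0."""
    return [(int(rng.random() < p), int(rng.random() < p)) for _ in range(n_loci)]

def child(father, mother, rng):
    """At every locus the child receives one gene from each parent, chosen at random."""
    return [(rng.choice(f), rng.choice(m)) for f, m in zip(father, mother)]

def stature(individual):
    """Additive trait: the sum of all gene values (no dominance)."""
    return sum(a + b for a, b in individual)

def pearson(xs, ys):
    """Sample correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def father_son_correlation(n_families=20_000, n_loci=10, p=0.5, seed=1):
    rng = random.Random(seed)
    fathers, sons = [], []
    for _ in range(n_families):
        father = random_individual(n_loci, p, rng)
        mother = random_individual(n_loci, p, rng)  # random mating
        son = child(father, mother, rng)
        fathers.append(stature(father))
        sons.append(stature(son))
    return pearson(fathers, sons)

r = father_son_correlation()
print(round(r, 3))
```

The simulated coefficient should fall near one-half, in agreement with the value quoted above; changing the number of loci or the gene frequency in this sketch should not alter that limiting value.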

The hypothetical entities involved in the original Mendel model are genes. 
So long as there were no data indicating the nature of genes or their location, 
the esthetic value of the model was limited. However, presently it was discovered 
that the chromosomes, visible under the microscope, must be the carriers 
of the genes. Furthermore, a method was developed to determine whether 
any two pairs of genes are carried by the same chromosome or not. Finally, 
if three hereditary traits A, B and C have their genes in the same chromosome, 
it is possible to find whether the genes B are located between A and C or not. 
In fact, it became possible to map the location of several different genes along 
the chromosome which carries all of them. It is still true that no one has seen 
a separate gene. However, the progress in the identification of the originally 
hypothetical genes has been remarkable and this degree of satisfaction of cri- 
terion (ii), combined with that of criterion (i), determines the importance of 
the Mendelian model of heredity. 

After a somewhat slow start about a century ago, the indeterministic dynamic 
approach to phenomena gradually gained momentum. For example, already 
before the last World War, serious efforts were made to construct a convincing 
stochastic model of epidemics. These efforts did not quite succeed because of the 
great complexity of the phenomenon, but served as an inspiring stimulus. Cur- 
rently, dynamic indeterminism is in full swing and it is hard to think of a 
serious scientific study whose main subject is other than the construction or 
verification of a stochastic model. This is true in all domains of human thought: 
economics and other social sciences, technology, modern physics, population 
studies including the struggle for existence, medicine and cosmology. 

The remainder of this paper is given to two examples of dynamic indeter- 
ministic studies, both now in progress and both far from completed. In dis- 
cussing these examples I shall try to emphasize the relationship of the models 
considered to the criteria of broad applicability and of identifiability of details. 
It will be seen that the attempts to satisfy these criteria must demand the use 
of probability and statistics on a level much higher than that customary before 
the war. In fact, any such attempt requires new statistical and probabilistic 
results. 


7. STOCHASTIC MODELS OF CARCINOGENESIS 


The individual-to-individual variation in the occurrence of cancer is so great 
within any species that it may be taken for granted that the understanding of 
carcinogenesis is possible only in terms of a chance mechanism. Thus, many 
attempts have been made to construct a stochastic model of carcinogenesis and 
quite a few of such models were recently reviewed at the Fourth Berkeley Sym- 
posium on Mathematical Statistics and Probability. 

Each model is constructed with reference to a particular set of observations 
or experiments. Certain models of carcinogenesis are based on the analysis of 
death rates from cancer of humans, compiled for particular countries. The 
basic idea is that, in order that cancer occur, a certain number k of some hypo- 
thetical events must occur in the individual considered. Assuming that the 
probability of any such event occurring per unit of time remains constant, a 
formula can be deduced for the probability that an individual of age X will 
have cancer. The comparison of this formula with the age specific death rates 
yields an estimate of k. 
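
One way in which such a formula may arise can be sketched as follows. The sketch rests on an assumption that goes beyond the text: the k events are taken to occur as the points of a Poisson process with a constant rate, so that the probability of at least k events by age x is one minus the sum of the first k Poisson terms. The function names and the numerical values are hypothetical. For small rates this probability behaves like a constant times the k-th power of x, so that the slope on a log-log plot against age suggests an estimate of k.

```python
import math

def prob_cancer_by_age(x, k, rate):
    """P(at least k events by age x) when events occur at a constant rate (assumed)."""
    lam = rate * x
    fewer_than_k = sum(math.exp(-lam) * lam ** j / math.factorial(j) for j in range(k))
    return 1.0 - fewer_than_k

def loglog_slope(x1, x2, k, rate):
    """Slope of log probability against log age between ages x1 and x2."""
    p1 = prob_cancer_by_age(x1, k, rate)
    p2 = prob_cancer_by_age(x2, k, rate)
    return (math.log(p2) - math.log(p1)) / (math.log(x2) - math.log(x1))

# Illustrative values only: five events, each with a small yearly rate.
slope = loglog_slope(40.0, 60.0, k=5, rate=1e-3)
print(round(slope, 2))
```

With these illustrative values the slope comes out close to 5, the assumed number of events, which is the sense in which a comparison with age specific rates can yield an estimate of k.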

It is obvious that the model just described is far from satisfying the criterion 
of identifiability of details. Also, I should mention the well known fact that the 
incidence of cancer varies considerably from one social group to another. Thus, 
the death rates based on mass observations must be average death rates and it 
is rare that the average of several different functions of a given category is 
represented, with reasonable accuracy, by a function of the same category. For 
example, the average of two negative exponentials 


$$\left[\exp\{-a_1 x\} + \exp\{-a_2 x\}\right]/2 \qquad (2)$$


may be more easily approximated by $\exp\{-x^2\}$ than by a simple exponential 
$\exp\{-a_3 x\}$. It follows that the shape of the curve representing the average incidence 
of cancer as a function of age, treated as an indicator of a plausible 
mechanism of the origin of cancer, may be misleading. 
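
That the average of two negative exponentials is not itself a negative exponential is easily verified numerically. In the sketch below, the two rate constants and the function names are illustrative assumptions. A single exponential has the same logarithmic rate of decrease at every point; for the average of two exponentials this rate changes with x.

```python
import math

def average_of_exponentials(x, a1, a2):
    """Average of two negative exponentials with rate constants a1 and a2."""
    return 0.5 * (math.exp(-a1 * x) + math.exp(-a2 * x))

def local_decay_rate(x, a1, a2, h=1e-5):
    """Minus the derivative of the logarithm, by a central difference."""
    def log_f(u):
        return math.log(average_of_exponentials(u, a1, a2))
    return -(log_f(x + h) - log_f(x - h)) / (2 * h)

# Illustrative rate constants for two hypothetical subgroups.
a1, a2 = 0.5, 3.0
rate_at_start = local_decay_rate(0.0, a1, a2)
rate_later = local_decay_rate(10.0, a1, a2)
print(round(rate_at_start, 4), round(rate_later, 4))
```

At the origin the rate equals the mean of the two constants, while far from the origin it approaches the smaller of them. No single exponential reproduces both, so the form of an averaged curve need not reflect the form of the curves for the component groups.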

The first comprehensive model of carcinogenesis seems to be due to two 
Danish scholars, Niels Arley and Simon Iversen. In an apparent desire to satisfy 
the criterion of identifiability of details, Arley and Iversen considered labora- 
tory experiments with cancer induced in animals by the application of certain 
carcinogens. Here the animals were reasonably uniform and detailed information 
was available as to the dosage of the carcinogen and the method of its 
application. The experimental variable was mostly the so-called induction time, 
that is, the time interval between the first application of the carcinogen and 
the appearance of a detectable tumor. 

The basic assumption of the model is that the carcinogen, whether a chemical 
or some irradiation, produces a “hit” on a sensitive molecule in a cell, called the 
cancer control center. As a result of a hit, the cell undergoes a mutation-like 
change and becomes a cancer cell. A hit on the cancer control center may be 
delivered either by the carcinogen itself (direct effect) or by a molecule of an- 
other substance activated by the carcinogen (indirect effect). The mathematical 
deductions of the two authors appear to have been limited to the hypothesis 
of direct effect. The instantaneous probability of a cancer forming hit was as- 
sumed to be a function of the dose of the carcinogen applied. In constructing 
this function, the authors allowed for the possibility that the carcinogen will 
kill certain cells, whether mutated or not. 

In the model, the induction time is treated as the sum of two intervals, one 
from the first application of the carcinogen to the first hit, and the other from 
the first hit to the appearance of a detectable tumor. This latter interval, described 
as the growth interval and denoted by $t_g$, is assumed to be a random 
variable with an unknown distribution. The authors’ calculations are based on 
the tentative assumption that $t_g$ is normally distributed with unknown mean 
and variance. 
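
The structure of the induction time as a sum of two intervals can be imitated as follows. This is a sketch under simplifying assumptions and not the authors' own computation: the instantaneous probability of a hit is held constant, so that the waiting time to the first hit is a negative exponential variable, the growth interval is drawn from a normal distribution, and all numerical values and names are hypothetical.

```python
import random

def simulate_induction_times(hit_rate, growth_mean, growth_sd, n, seed=0):
    """Induction time = waiting time to first hit + growth interval.

    The waiting time is exponential with constant rate hit_rate (an assumption);
    the growth interval is normal, following the tentative assumption in the text.
    """
    rng = random.Random(seed)
    return [rng.expovariate(hit_rate) + rng.gauss(growth_mean, growth_sd)
            for _ in range(n)]

# Illustrative values, in days: mean wait 50, growth interval 100 with s.d. 15.
induction_times = simulate_induction_times(0.02, 100.0, 15.0, 50_000)
mean_time = sum(induction_times) / len(induction_times)
print(round(mean_time, 1))
```

The simulated mean should lie near 1/0.02 + 100 = 150 days. The resulting distribution is skewed to the right by the exponential component, so that its shape depends on both parts of the model and not on the normal assumption alone.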

The deductions from this model, in the form of the distribution of induction 
time with reference to the given dose and the method of application of the 
carcinogen, were compared with the results of an impressive array of experi- 
ments. The results of Arley & Iversen [2] appeared in a series of papers published 
in the Acta Pathologica et Microbiologica Scandinavica. Here, then, a conscious 
effort was made to satisfy the criterion of broad applicability. 

From most of these trials, the Arley-Iversen model came out with flying 
colors. The calculated distribution of induction time appeared to agree closely 
with that observed by different experimenters working on different strains of 
animals treated with different carcinogens. However, there was one exception. 
The exceptional data refer to the experiments of Harold F. Blum [3] on mice 
irradiated with ultraviolet rays. 

The essential disagreement between the Arley-Iversen model and the actual 
mechanism of carcinogenesis came to the fore when an attempt was made to 
use the constants involved in the model as estimated from certain basic experi- 
ments of Blum, say A, in order to predict the distribution of induction time in 
other experiments, say B, conducted somewhat differently. The experiments I 
designate by the symbol A consisted in irradiating mice at regular intervals, 
for example, every day, once a week, etc. The particular exposures varied from 
four minutes to 45 minutes. The essential feature of experiments A was that 
the recurrent irradiations continued until the mice died. Using these experi- 
ments, Arley and Iversen estimated the parameters involved in their model 
and computed the theoretical distribution of the induction time corresponding 
to all the different combinations of intervals between irradiation and exposure 
length. In all these cases the agreement between theory and observations was 
satisfactory. The experiments B of Blum differed from those I designated by 
A in that the regularly administered doses of radiation did not extend over the 
lifetime of the mice but were interrupted after 74 days in one case, and after 88, 
95, and 166 days in the three other cases. 

The Arley-Iversen model is specific and a formula representing the theoretical 
distribution of the induction time can be deduced from it for each of the four 
series of experiments of category B. Obviously, if the model corresponded 
exactly to the true mechanisms of carcinogenesis, the constants of the model 
deduced from experiment A would produce theoretical distributions of induc- 
tion time for experiments B in reasonable agreement with what was obtained 
empirically. Unfortunately, no such agreement was observed and the empirical 
distributions of induction time in Blum’s experiments B are of a character entirely 
different from that predicted by the model. 

The comments of Arley and Iversen are to the effect that the discrepancy 
between their formulas and the observations may be due either to the fact that 
the actual effect of irradiation is indirect or to the unrealistic character of their 
tentative assumption that the time of growth of the tumors is normally dis- 
tributed. I am looking forward to seeing the results of further study by the 
same authors to elucidate these points. 

It is appropriate to mention that the indirect character of the effect of the 
carcinogen is suspected by a number of biologists. Partly in order to investigate 
this point, a series of very detailed experiments was performed by 
Polissar and Shimkin [4]. The suspected indirectness of the effect of a car- 
cinogen was visualized somewhat differently than by Arley and Iversen. While 
Polissar and Shimkin express their ideas very tentatively and carefully so as 
not to appear to be jumping to conclusions, for the sake of brevity of the present 
account I shall be more specific. The basic idea seems to be that a living cell 
affected by the carcinogen undergoes a mutation-like change “of the first order.” 
These first-order mutants multiply faster than normal cells. Also, they die. The 
rate of multiplication is less than the rate of death so that, eventually, all the 
first-order mutants disappear. However, the first-order mutant cells are candi- 
dates for another mutation, the so-called second-order mutation. The second- 
order mutants are cancer cells. 
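
The fate of the first-order mutants in such a scheme can be imitated by a simple simulation. The following sketch covers only a part of the mechanism and rests on assumptions not stated above: new first-order mutants are supposed to be produced by the carcinogen at a constant rate, each living mutant divides and dies at constant rates with the death rate the larger of the two, and the second-order mutation is left out. All names and numerical values are hypothetical.

```python
import math
import random

def simulate_mutant_count(t_end, arrival, birth, death, rng):
    """Number of first-order mutants alive at time t_end, starting from none.

    arrival: rate at which the carcinogen creates new mutants (assumed constant)
    birth, death: rates of division and death per living mutant
    """
    t, x = 0.0, 0
    while True:
        total_rate = arrival + (birth + death) * x
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_end:
            return x
        # Choose the kind of event in proportion to its rate.
        if rng.random() * total_rate < arrival + birth * x:
            x += 1  # a new mutant arrives or an existing one divides
        else:
            x -= 1  # one mutant dies

def expected_mutants(t, arrival, birth, death):
    """Mean count from dE/dt = arrival + (birth - death) * E, with E(0) = 0."""
    excess = death - birth
    return arrival / excess * (1.0 - math.exp(-excess * t))

rng = random.Random(2)
arrival, birth, death, t_end = 2.0, 0.5, 1.0, 5.0
counts = [simulate_mutant_count(t_end, arrival, birth, death, rng) for _ in range(5000)]
sample_mean = sum(counts) / len(counts)
print(round(sample_mean, 2), round(expected_mutants(t_end, arrival, birth, death), 2))
```

Since the rate of death exceeds that of multiplication, the mean number of first-order mutants levels off instead of growing without bound, and the mutants would die out if the production of new ones were stopped. A fuller treatment would add the second-order mutation and would have to allow for rates that vary in time.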

It will be seen that the above model, which Polissar and Shimkin must have 
had in their minds, is much more specific than that of Arley and Iversen and is 
more readily open to detailed empirical verification. If first-order mutants exist 
—what about finding them in the bodies of the experimental animals? If such 
presumed first-order mutant cells are found, and if they are really predecessors 
of cancer, then the parts of the experimental animals rich with first-order 
mutants must show more cancer than those that are poor. Can this be observed? 

Polissar and Shimkin used a chemical carcinogen which they injected, in 
varying doses, into mice. Rather than wait for the appearance of the first 
“noticeable tumors” (according to one estimate, a “noticeable” tumor may 
mean 60 million cancerous cells!) Polissar and Shimkin sacrificed groups of 
mice at regular intervals from the very start of the experiment and investigated 
their lungs under the microscope. They identified accumulations of cells sus- 
pected to be first-order mutants and called them hyperplastic foci. The esti- 
mated numbers of such foci per lung are given at weekly intervals. The same 
applies to the average volumes of the foci and, of course, to the numbers and to 
the volumes of tumors. It will be seen that these experiments reach far deeper 
than those limited to the appearance of a noticeable tumor. 

This is the latest development in the biological study of carcinogenesis that 
I am going to describe. It is interesting by itself. However, my purpose is to 
discuss the demands that this research addresses to the statisticians. The sta- 
tistics needed in order to deal with the Arley-Iversen model is limited to opera- 
tions on variables following the negative exponential and the normal distribu- 
tion. This is within easy reach of every university graduate in statistics. How- 
ever, it is hardly disputable that the part of the Arley-Iversen model concerned 
with the growth time of cancer is somewhat less distinct than their earlier 
quantum theoretical treatment of what happens within a cell as a result of a 
hit. The Polissar-Shimkin experiments, giving hyperplastic cells per unit of 
area of a slice of a lung inspected under the microscope, and also giving the 
number of tumors and their sizes, do not leave any room for a summary treat- 
ment by supposing that something or other is normally distributed. In order 
to treat their experiments with the attention they deserve, we are forced to 
consider a quite complicated stochastic process. One part of this process deals 
with the number X(t) of first-order mutant cells alive at time t. This is the so- 
called birth-and-death-with-immigration process on which important results 
are due to Richard Bellman and T. E. Harris in this country, to M. S. Bartlett 
and D. G. Kendall in England, and to A. N. Kolmogorov and A. M. Yaglom 
in the U.S.S.R. Unfortunately, the easy part of the theory assumes that the 
probability that a given cell will divide within a short interval of time does not 
depend on the age of the cell, something which is inconsistent with the observa- 
tions. On the other hand, if one adopts age dependent probabilities of multi- 
plication of cells, one encounters considerable difficulties, [5], [6]. 
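
The "easy part of the theory" can be illustrated by a minimal simulation sketch. It assumes, as that theory does, that each cell divides and dies at constant rates regardless of its age. The names `birth`, `death` and `immigration` (the rate at which new first-order mutants are created by the carcinogen) and the numerical values in the usage note are hypothetical, chosen only so that the death rate exceeds the rate of multiplication, as in the model described above.

```python
import math
import random


def expected_mean(birth, death, immigration, t):
    """Mean of X(t) for the linear process started from X(0) = 0.

    Solves dm/dt = (birth - death) * m + immigration.
    """
    r = death - birth
    if r == 0:
        return immigration * t
    return immigration / r * (1.0 - math.exp(-r * t))


def simulate_bdi(birth, death, immigration, t_end, rng):
    """Return one simulated value of X(t_end), starting from X(0) = 0."""
    t, x = 0.0, 0
    while True:
        # Total rate of the next event: one immigration, birth or death.
        total = immigration + (birth + death) * x
        if total <= 0.0:
            return x  # nothing can ever happen again
        t += rng.expovariate(total)
        if t > t_end:
            return x
        u = rng.random() * total
        if x == 0 or u < immigration + birth * x:
            x += 1  # immigration or division adds one cell
        else:
            x -= 1  # death removes one cell


if __name__ == "__main__":
    rng = random.Random(0)
    runs = [simulate_bdi(0.5, 1.0, 2.0, 10.0, rng) for _ in range(2000)]
    print(sum(runs) / len(runs), expected_mean(0.5, 1.0, 2.0, 10.0))
```

With the death rate above the birth rate, the simulated average settles near immigration / (death - birth). An age-dependent division probability would require replacing the exponential waiting times, which is exactly where the difficulties of [5] and [6] begin.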

The second part of the process Y(t), that concerned with tumors, is also a 
birth-and-death-with-immigration process, connected with the first. My own 
effort in the study of the problem resulted in a formula giving the probability 
distribution of the number of those tumors which are destined to grow in- 
definitely (killer tumors). This formula, ref. [7], relates to an arbitrary meth- 
od of application of the carcinogen, whether by injection or by irradiation 
at a constant or at a varying rate. Unfortunately, in order to be able to handle 
the mathematics involved, I had to assume the unrealistic hypothesis men- 
tioned above concerned with the instantaneous probabilities of multiplication 
of the cells. However, this formula appears to agree qualitatively with the re- 
sults of Blum’s experiments both A and B. 

Fig. 1. Part of a hyperplastic focus. The cells represent the 
hypothetical first-order mutants. 

D. G. Kendall [8] managed to deduce formulas, also based on the same 
simplifying hypothesis, giving the moments of the joint distribution of X(t) 
and Y(t). However, this distribution itself could not be obtained. This latter 
distribution, and even its moments expressed as functions of time, are rather 
important because the suspected connection between the hyperplastic cells and 
tumors can be established or disproved only by the study of the joint distri- 
bution of the numbers of the presumed first-order mutant cells and of the 
tumors. Figures 1 and 2, taken from Polissar and Shimkin [4], show, respec- 
tively, a diffuse hyperplastic focus and a tumor in a lung of a mouse. The 
biologists can and do count them. (However, thus far only separate counts of 
hyperplastic foci and separate counts of tumors are available. It may be hoped 
that, in due course, there will be simultaneous counts made of both these entities 
per unit volume of the lung.) What is demanded from a statistician is the theo- 
retical counterpart: the joint distribution of the two variables X(t) and Y(t), 
deduced on reasonably realistic assumptions. Unfortunately, thus far this dis- 
tribution, or anything like it, is not in our tool box. 




AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1960 


8. STUDY OF THE POPULATION OF COMETS IN THE SOLAR SYSTEM 


When trying to select another example of applied statistical research in science 
requiring new theoretical statistical results, an example to accompany that of 
the study of carcinogenesis, it was natural for me to think of astronomy and, 
in particular, of the study of galaxies, in which, under the influence of some of 
my colleagues, I personally have been engaged for a decade or so. However, 
apart from articles in the astronomical literature, two broad accounts of this 
work were recently published in the statistical journals [9] and [10]. For this 
reason, rather than speak of galaxies, I wish to bring to your attention the 
problem of comets in the solar system. Very interesting papers by Hammersley, 
D. G. Kendall and Lyttleton, were recently presented at the Fourth Berkeley 
Symposium. 

Fig. 2. Tumor in a mouse’s lung, a clone of presumed second-order mutant cells. 

The extra reason for my speaking of comets is that, to my knowledge, the 
problem of the origin of these objects, whether the comets originated in the 
Solar System or in the interstellar space, was the first problem in science treated 
indeterministically, in which indeterminism was not marginal but applied 
directly to the main subject of study. Also, as far as I am informed, this par- 
ticular study led to the first test of a statistical hypothesis. 

This work, first published in 1773, is due to Laplace [11]. His reasoning was 
that, since the orbital planes of all planets in the Solar System are very nearly 
parallel, if the comets are regular members of the same system, then the planes 
of their orbits should also have a “preferred” direction. The alternative to this 
hypothesis was that the comets are interstellar objects which invade the Solar 
System, in which case the distribution of the angle, say $\varphi$, between the ecliptic 
and the orbital plane of the comet should be uniform from $-\pi/2$ to $+\pi/2$. 

Here, then, the main subject of study was not any particular comet or comets 
as such, but the population of comets of which the comets known at the time 
constituted a random sample. The statistical hypothesis $H$ that Laplace tested 
was that $\varphi$ is uniformly distributed over the interval $(-\pi/2, \pi/2)$. The criterion 
selected for the test, on purely intuitive grounds, was the arithmetic 
mean $\bar{\varphi}$ of the $n$ observed angles $\varphi_1, \varphi_2, \ldots, \varphi_n$. For purposes of the test Laplace 
deduced the exact distribution of $\bar{\varphi}$ as implied by $H$. The deduction is perfect. 
However, in his attempt to make a plot of the probability density of $\bar{\varphi}$, Laplace 
made a most curious mistake which appears to have prevented him from seeing 
that, with very moderate values of $n$, the probability density of $\bar{\varphi}$ is very near 
the normal density. As is well known, the density of $\bar{\varphi}$ is a combination of $n$ 
branches of parabolas, each of order $n-1$. If $n=2$, then the graph of the density 
forms a rectilinear triangle. If $n=3$, there are three branches of second-order 
parabolas, combining smoothly into a bell-shaped curve. This latter circum- 
stance was overlooked by Laplace and his graphs looked as in the right side 
of Figure 3. 


Fig. 3. Graphs of probability density of the mean of three independent variables 
uniformly distributed between −a and +a: correct shape (left) and shape 
visualized by Laplace (right). 
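
The piecewise-polynomial density described above can be evaluated directly. The following is a minimal sketch, not Laplace's own derivation: it uses the standard closed form for the density of a sum of n independent Uniform(0, 1) variables and rescales it to the mean of n variables uniform on (−a, +a). The function names are illustrative only.

```python
import math


def uniform_sum_pdf(s, n):
    """Density at s of the sum of n independent Uniform(0, 1) variables."""
    if s < 0 or s > n:
        return 0.0
    total = 0.0
    # Each interval [k, k+1] contributes one polynomial branch of degree n-1.
    for k in range(min(int(math.floor(s)), n) + 1):
        total += (-1) ** k * math.comb(n, k) * (s - k) ** (n - 1)
    return total / math.factorial(n - 1)


def mean_uniform_pdf(m, n, a):
    """Density at m of the mean of n independent Uniform(-a, +a) variables."""
    scale = n / (2.0 * a)  # the mean m corresponds to the sum n*(m + a)/(2a)
    return scale * uniform_sum_pdf(scale * (m + a), n)


def normal_pdf(m, n, a):
    """Normal density with the same mean (0) and variance a^2 / (3n)."""
    sd = a / math.sqrt(3.0 * n)
    return math.exp(-0.5 * (m / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


if __name__ == "__main__":
    for n in (2, 3, 6, 12):
        print(n, mean_uniform_pdf(0.0, n, 1.0), normal_pdf(0.0, n, 1.0))
```

For a = 1 the central ordinate is exactly 1 when n = 2 (the apex of the triangle) and 1.125 when n = 3, against about 1.197 for the normal density with the same variance; the gap narrows quickly as n grows.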


From the point of view of the classification given in the present paper, Laplace’s study 
could hardly be considered as representing dynamic indeterminism. This is not 
true of papers by Hammersley, Kendall, and Lyttleton. Here, the astronomical 
ideas originated from Lyttleton. Hammersley and Kendall were concerned with 
certain statistical aspects of the problem (see [12], [13], and [14]). 

According to the hypothesis of Lyttleton, the comets originate from inter- 
stellar dust. When the Sun passes through a dust cloud, the interstellar dust 
converges to the axial line behind the Sun and forms condensations. These con- 
densations are then picked up by gravitation and thrown in pursuit of the Sun. 
Each such condensation of dust, following a very elongated orbit, is a comet. 
Most cometal orbits are elliptical but others may be parabolic or hyperbolic. 

The character of a cometal orbit, whether elliptical, parabolic or hyperbolic, 
is determined by just one parameter, the total energy of the comet. If this 
energy is positive, the comet has a hyperbolic orbit and will escape from the 
solar system. The same thing will happen if the energy is zero, in which case the 
orbit is parabolic. If the energy is negative, the orbit is an ellipse. 

When a comet comes into the vicinity of the Sun, its orbit is subjected to 
perturbations by the planets, particularly by Jupiter and Saturn, whereby its 
energy receives a positive or a negative increment. Thus, the energy of a comet 
after its nth passage around the Sun, is the sum of its initial energy X and of 
n components due to perturbations, 

$$E_n = X + \sum_{k=1}^{n} Y_k.$$


When this sum becomes positive or zero, the comet leaves the solar system. 
In their models, Hammersley and Kendall treat the components $Y_k$ as identi- 
cally distributed independent random variables and study the stochastic 
process $E_n$. The subject of their study is the time in years (not the number of 
passages near the Sun) between the first arrival of a comet and its ultimate 
departure from the solar system. The interest of the study and also its difficulty 
is connected with the fact that the duration $T_{n+1}$ of the $(n+1)$st revolution de- 
pends on $E_n$ and tends to infinity as $E_n$ tends to zero. 
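
The structure of this process can be sketched in a few lines. The sketch is not the Hammersley-Kendall analysis: the normal distribution of the perturbations is an assumption made purely for illustration, and the duration of a revolution is taken from Kepler's third law (period proportional to the -3/2 power of the absolute energy of a bound orbit) in units that make the constant of proportionality equal to 1. All function names are hypothetical.

```python
import itertools
import random


def revolution_time(energy):
    """Duration of one revolution for a bound orbit (energy < 0).

    Kepler's third law gives a period proportional to |energy| ** -1.5;
    units are chosen here so that the constant of proportionality is 1.
    """
    return abs(energy) ** -1.5


def lifetime(initial_energy, increments):
    """Return (years, passages) until the energy first becomes >= 0.

    `increments` supplies the perturbations Y_1, Y_2, ... in order; if it
    runs out first, the totals accumulated so far are returned.
    """
    energy = initial_energy
    years, passages = 0.0, 0
    for y in increments:
        if energy >= 0:
            break
        years += revolution_time(energy)  # T_{n+1} depends on E_n
        energy += y                       # E_{n+1} = E_n + Y_{n+1}
        passages += 1
    return years, passages


def gaussian_increments(sigma, rng):
    """Endless i.i.d. normal perturbations (an assumed distribution)."""
    while True:
        yield rng.gauss(0.0, sigma)


if __name__ == "__main__":
    rng = random.Random(1)
    # Cap the walk at 10,000 passages so the example always terminates.
    capped = itertools.islice(gaussian_increments(0.5, rng), 10000)
    print(lifetime(-1.0, capped))
```

The sketch makes the stated difficulty visible: a comet whose energy drifts close to zero from below contributes one enormous revolution time, so the lifetime in years is dominated by a few near-parabolic revolutions rather than by the number of passages.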

The observations on known comets, assumed to be a random sample of the 
total population, provide estimates of the total number of comets present in 
the solar system. Here again, then, the effort to treat realistically a problem 
in the applied field required a delicate study of a stochastic process. 

While admiring the work of Lyttleton, Hammersley and Kendall, I would 
like to start the stochastic model a little earlier than they did. Also, I would 
like to include in the model a certain element that has been neglected. 

Consider a moment $t$ in time measured from some arbitrary origin, and a 
line $L$ representing the path of the solar system. Let $L$ be the $OX$ axis of an or- 
thogonal system of coordinates with the origin at the Sun and let the positive 
direction of this axis be that opposite to the Sun’s motion. For every point 
$(x, y, z)$ with $x > 0$, and for every $t$, let 


$$\lambda(t, x, y^2 + z^2)\,\tau + o(\tau) \tag{4}$$


represent the probability that in the interval of time from $t$ to $t+\tau$ there will 
be a comet born at $(x, y, z)$. To this comet I would ascribe an initial velocity, 
treated as a random variable with some distribution. The assumptions just 
made determine the probability distribution of the initial energy of the comet, 
earlier denoted by $X$. Although the problem may be very difficult, it may then 
be possible to evaluate the distribution of energy of a comet in the vicinity of 
the Sun, without it being specified whether the observations refer to the first, 
second, or any other visit to the solar system. Then it may be possible to relate 
the observations to the properties of the function $\lambda$. Obviously, the larger the 
value of $\lambda$, the denser is the dust surrounding the path of the Sun. 

In this enticing program there is a considerable difficulty, which, I suspect, 
affects also the study of Lyttleton, Hammersley and Kendall. This difficulty is 
connected with the very plausible assumption that the comets actually ob- 
served are not all the comets that approach the Sun in any given period. The 
fainter comets, and also those that fail to arrive in a close vicinity of the Earth, 
must escape notice from time to time. Thus, in order to connect realistically 
any given stochastic model of the population of comets with the observations, 
it is necessary to introduce the probability that a comet of specified properties 
will be observed. In our studies of galaxies we found it necessary to deal with 
a probability of this kind. We assumed that it depends solely upon the appar- 
ent brightness of the galaxy, which depends in a tricky way on the galaxy’s 
intrinsic brightness and on its distance. With comets, this dependence is some- 
what simpler, but even so is likely to cause trouble. 
9. CONCLUSIONS 


The general discussions of the earlier part of the paper and the two examples 
of current indeterministic studies of natural phenomena illustrate the general 
statement that since the end of World War II there has been a fundamental 
change in the character of research in sciences requiring the cooperation of a 
statistician. With due allowances for some exceptions, one might say that the 
period before the war was the period of static indeterminism in science. The 
term stochastic process, while already born, was mostly beyond the horizon 
of an applied statistician and of a scientist. Currently, in the period of dynamic 
indeterminism in science, there is hardly a serious piece of research which, if 
treated realistically, does not involve operations on stochastic processes. The 
time has arrived for the theory of stochastic processes to become an item of 
usual equipment of every applied statistician. 
MARKET GROWTH, COMPANY DIVERSIFICATION 
AND PRODUCT CONCENTRATION 
1947-1954* 


Ralph L. NELSON 
National Bureau of Economic Research 


The hypothesis that market growth leads to reduced sellers’ concen- 
tration is presented and briefly elaborated. Then, after an evaluation 
of the data, they are used to make a direct and an indirect test of the 
hypothesis. In the direct test, a negative relationship is found between 
concentration change and market growth, both for aggregate changes 
and for those of individual markets. In the indirect test concentration 
change is found to be related to the increasing importance of multi- 
product plants in a fashion that implies that the increased diversifica- 
tion of companies, rather than that of their plants alone, has brought 
lower levels of product concentration. Both tests lend positive though 
not conclusive support to the hypothesis. 


1. INTRODUCTION 


The trend of competition in the United States has eluded precise description 
and for good reason. Competition, like other pervasive economic forces, 
operates in markets in a variable and often unpredictable fashion. The collec- 
tion by business of “devices for circumventing barriers to profits” is a changing 
process, if partly in the semantics employed, and economists, not to mention 
judges, have been hard put to formulate precise standards for classifying mar- 
kets as competitive or monopolistic. Where such classifications have been made, 
professional economists have arrived at substantially different results even 
though they had at their disposal much the same set of facts.¹ 

This paper will focus on one element in the change in competition, the struc- 
ture of product markets, or product concentration. While it is gratifying to 
know that economists have usually considered the element of structure as one 
of the more important determinants of competitive behavior, it is disquieting 
and reassuring to know that they have never been content to let their judgment 
stand on this one leg alone.² It is hoped that while the paper may not provide 


* Author’s Note: This article is a revision and extension of a paper presented to the American Statistical Associ- 
ation in December 1959. The data on which most of the analysis is based were developed in 1958-9 at the U. S. 
Census Bureau while on leave-of-absence from Northwestern University. The research was conducted with the 
support of the Committee on Economic Census Data of the Social Science Research Council as part of its Census 
Monograph program; their assistance is gratefully acknowledged. Thanks are also due many persons at the Census 
Bureau who gave generously of their time and energies in producing the basic data, to Irwin Silberman who aided 
in their analysis, to Professors George Stigler and Morris Adelman for helpful suggestions and to both referees 
whose thorough comments added much to its precision and readability. Needless to say the views expressed are not 
necessarily those of the SSRC, the Committee on Economic Census Data, the Census Bureau, or of any of the 
individuals mentioned. Full responsibility for them, as well as for any remaining faults and errors, remains mine. 

1 For an illuminating comparison of attempts to measure the extent of competition and monopoly in the 
American economy see Solomon Fabricant, “Is Monopoly Increasing?”, Journal of Economic History, Winter 1953, 
pp. 89-94. 

2 A variety of analyses of the relation between the structure of an industry and its competitive performance 
have been made; e.g. see G. J. Stigler, “A Theory of Delivered Price Systems,” American Economic Review, De- 
cember 1949, and his “Competition in the United States,” Five Lectures on Economic Problems, London, Longmans 
Green, 1949. Also G. W. Nutter, The Extent of Enterprise Monopoly in the United States, 1899-1939, University of 
Chicago Press, 1951; T. Scitovsky, “Economic Theory and the Measurement of Concentration,” Business Concen- 
tration and Price Policy, National Bureau of Economic Research, 1955; R. Heflebower, “Monopoly and Competi- 
tion in the United States of America,” in E. Chamberlin ed. Monopoly and Competition and their Regulation, New 
York, St. Martin’s Press, 1954; and numerous studies of individual industries. 
complete answers to questions concerning the trend in competition, it may at 
least help economists to reduce the uncertainty in their judgments. 

The specific purpose is to make a preliminary test of the hypothesis that 
sellers’ concentration will tend to decline in a growing economy. As markets 
grow in size, at least for the growth stages beyond the periods of industry for- 
mation and early acceleration, one might expect an increasing divergence be- 
tween the individual market and the firm’s part of it. There are a number of 
reasons for this. For one thing there may be limits on the optimum size of 
plants, and growth may then take the form of increasing the number rather 
than the size of plants producing a given product. Multi-plant operation may 
involve increasing costs of coordination and control, putting an upper limit on 
the number of plants a firm can economically administer. However, even if 
substantial economies of administration, marketing, and finance are to be 
gained through multi-plant operation, there may be good reasons for growing 
by diversification rather than by increasing or maintaining one’s share of a 
given product market. A greater opportunity to achieve over-all stability in 
sales, and a larger number of more profitable new product lines into which it is 
possible to diversify may accompany the growth of the economy. This, together 
with a fear of anti-trust prosecution may lead a firm to diversify rather than 
grow in its traditional market. Stated more broadly, the growth of an economy, 
and its markets, may make feasible a greater exploitation of the division of 
labor along both industry and functional lines, and lead to lower concentration 
levels.³ 


2. EVALUATION OF DATA 


To test the hypothesis the change in concentration of 399 product classes 
between 1947 and 1954 will be examined, the group being limited to those prod- 
uct classes for which reliable concentration measures could be computed.⁴ The 
selection represents 35 per cent of the 1,132 five-digit product classes into which 
manufacturing output is divided and 39 per cent of the 1,023 product classes for 
which concentration measures were presented by the Senate Anti-trust Sub- 
committee.⁵ The product classes in the sample are on the average both larger in 
size and more highly concentrated than all product classes (Table 1). While 
comprising only 35 per cent of the total number of product classes, they ac- 
counted for 48 per cent in 1947 and 52 per cent in 1954 of the dollar value of all 
product class shipments by manufacturing establishments. 


3 It is not clear, however, that a decline in product concentration levels through diversification in the manu- 
facturing stage of production and distribution will necessarily increase competition. Such diversification may be 
based on the economies of large scale distribution and, especially where concentration levels were high to begin with, 
may be used to effectively exclude new competition. This argument is examined more fully by Joe Bain in Barriers 
to New Competition, Harvard, 1956, Chapter 3. 

4 In the planning stages of the project, product class tabulations were projected for the years 1947, 1954-5-6-7, 
using Annual Survey of Manufactures files for the four later years. A product class, to be selected, had to be com- 
parable in definition in the several years. In addition, to insure reliability in measurement, it had to have 70 per cent 
of its shipments made by establishments having 100 or more employees in 1954 or 1955, the Annual Survey panel 
including these larger establishments on a non-sampling or certainty basis. Moreover, the four-digit industry, to be 
selected, had to have 90 per cent of its shipments accounted for by five-digit product classes that were comparable 
and that promised reliability. It was later decided to make product class tabulations for 1947 and 1954 using 
Census files, for reasons of time and cost. The group of industries (and product classes) was not reselected, however, 
as it was felt that reselection would not change the selection very much. 

5 Senate Anti-trust and Monopoly Subcommittee, Concentration in American Industry, Washington, 1957. 
TABLE 1. ANALYSIS OF SELECTION OF 399 PRODUCT CLASSES, 1954, 
BY SIZE OF PRODUCT CLASSᵃ AND CONCENTRATION RATIOᵇ 

                                   Concentration Ratio Class 
Product Class Size      80-100   60-79.9   40-59.9   20-39.9    0-19.9     Total 

      Selected Product Classes as Per Cent of Total Number of Product Classes 

$1 billion & over          100       100        82        69         0        68 
500-999 million             80       100        60        48        27        52 
200-499 million             76        61        73        57        13        55 
100-199 million             35        72        38        36        10        36 
0-99 million                53        31        34        23         2        30 

Total                       56        48        46        36        ..        39 

                Total Number of Product Classesᶜ in Universe 

$1 billion & over            3         6        11        13         7        40 
500-999 million              5         9        10        27        22        73 
200-499 million             17        33        56        53        38       197 
100-199 million             20        25        65        73        40       223 
0-99 million                57       105       138       149        41       490 

Total                      102       178       280       315       148     1,023 

ᵃ Measured by net value of shipments. 
ᵇ Share of product class shipments of four largest shippers of the product class. 
ᶜ Concentration in American Industry, Table 7, p. 14. 


The seven-year period 1947-1954 is relatively short to identify trends; how- 
ever, it was not possible to construct concentration measures from Census 
files before 1947.⁶ Therefore the main justification for speaking in terms of 
trends is the substantial growth in manufacturing output that took place. 
Total shipments by manufacturing establishments grew from $178 billion to 
$273 billion, or 53 per cent. A more meaningful measure, value added by manu- 
facture, which excludes the value of raw materials purchases from mines and 
forests, and the double counting of intermediate product shipments from one 
manufacturing establishment to another, increased from $74 billion to $117 
billion or 58 per cent. After crude adjustment for wholesale price level changes, 
“real” manufacturing value-added increased a substantial 38 per cent. 

In the present analysis concentration measures based upon the product con- 
cept rather than the industry concept will be used.⁷ That is, the four largest ship- 
pers⁸ of a given product class are used to compute the concentration ratio re- 
gardless of the industry classification of the establishments making these ship- 
ments. In like manner the total shipments of the product class are determined 
without regard to the industry classifications of all the establishments ship- 
ping the product class. Two kinds of errors that have plagued users of industry 
data are thus avoided. One is the overstatement of an establishment’s ship- 
ments of its most important product that results from the necessary convention 
of assigning all of an establishment’s activity to only one industry classification, 
even though it may produce several products. The second is the exclusion of es- 
tablishments assigned to other classifications, even though a substantial minor- 
ity of their shipments may be of the product in question. As others have care- 
fully demonstrated, these errors of plant homogeneity and coverage can lead 
to substantial and capricious misstatements for concentration measures based 
on the industry concept.⁹ 


6 The construction of the 1947 measures required the processing of roughly 450,000 product 
shipment records and was possible only because the classification system used in 1947 permitted fairly direct re- 
coding into the 1954 system. Files of individual establishment records for earlier Censuses had been destroyed and, 
even if available, recoding would have been prohibitively costly if possible at all. 

7 The basic Census unit of reporting is the establishment, and the establishment is classified by the product that 
accounts for the largest part of its shipments (primary product). Measures of industry size are computed by summing 
the establishments having a given classification number. They therefore include the production of secondary prod- 
ucts produced by establishments in the given primary product classification, and exclude the production of products 
in this classification that are produced by establishments in other classifications. The industry concept thus yields 
an imperfect measure of product output to the degree that multi-product establishments are important in (and out) 
of the industry. The product concept, on the other hand, counts the shipments of a product in any given classification 
without regard to the industry classification of the establishments producing it. 
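
The product-concept measure just described can be sketched as a short calculation. The record layout `(company, industry_code, product_class, value)` and the function names are assumptions made for illustration; they are not the Census Bureau's file format or tabulation procedure.

```python
def product_class_shipments(establishment_records, product_class):
    """Sum shipments of one product class by owning company.

    Each record is (company, industry_code, product_class, value). The
    industry code of the shipping establishment is deliberately ignored,
    which is what distinguishes the product concept from the industry one.
    """
    totals = {}
    for company, _industry, product, value in establishment_records:
        if product == product_class:
            totals[company] = totals.get(company, 0.0) + value
    return totals


def concentration_ratio(shipments_by_company, k=4):
    """Per cent of total shipments made by the k largest shippers."""
    values = sorted(shipments_by_company.values(), reverse=True)
    total = sum(values)
    if total <= 0:
        raise ValueError("total shipments must be positive")
    return 100.0 * sum(values[:k]) / total


if __name__ == "__main__":
    # Hypothetical records: company E ships product "p1" from a plant
    # whose primary product puts it in another industry ("ind3").
    records = [
        ("A", "ind1", "p1", 50.0),
        ("B", "ind1", "p1", 20.0),
        ("C", "ind2", "p1", 10.0),
        ("D", "ind2", "p1", 10.0),
        ("E", "ind3", "p1", 10.0),
        ("A", "ind1", "p2", 999.0),
    ]
    print(concentration_ratio(product_class_shipments(records, "p1")))
```

Grouping by company gives company (firm) concentration; grouping by establishment instead would give establishment (plant) concentration.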


3. CHANGES IN COMPANY CONCENTRATION 


As described by any of the three positional measures presented in Table 2 
the level of concentration for the selected product classes declined from 1947 
to 1954.¹⁰ There are, however, some problems of bias that need resolution before 
the finding can be related to the market growth-concentration decline hypoth- 
esis. As pointed out above, the selected product classes are on the average both 
larger and more highly concentrated than all product classes. As the selection 
was made on the basis of 1954 data, the selected group is probably biased to- 
ward an increase in concentration levels, as a 1947 product class showing a 
decline in concentration by 1954 would have been less likely to be selected than 
one that showed an increase. 

The group is also biased towards the more rapidly growing product classes. 
A 1947 product class having a low rate of growth to 1954 would have been less 
likely to be selected than one having a high rate. This bias is reflected in the 
growth of the selected and unselected groups. The increase in the aggregate 
manufacturing shipments of the 399 product classes was 69 per cent compared 
to 39 per cent for the unselected group. That the 399 showed a decline in con- 
centration level suggests that the aforementioned bias toward the inclusion of 
product classes of increased concentration was more than offset by the reducing 
effect of market growth on concentration. The finding for the group of 399 and 


8 By “largest shippers” is meant the largest establishments without regard to ownership if the purpose is to 
examine establishment (plant) concentration or the largest groups of establishments owned by single companies 
if the purpose is to examine company (firm) concentration. 

9 See especially Maxwell R. Conklin and Harold T. Goldstein, “Census Principles of Industry and Product 
Classification, Manufacturing Industries” in Business Concentration and Price Policy, National Bureau of Economic 
Research, 1955, pp. 15-36. Also, Betty Bock, Concentration Patterns in Manufacturing, National Industrial Con- 
ference Board Studies in Business Economics, No. 65, Chapter 2. 

10 The mean concentration ratio of the 399 product classes, computed from the grouped data of Table 2, de- 
clined 1.75 points, from 54.43 in 1947 to 52.68 in 1954. The probability of drawing a pair of random samples having 
the observed difference in means from universes of identical means is about one in ten. While the probability is low 
enough to cast doubt on the hypothesis that no decline occurred in the average concentration of all product classes, 
it is not low enough to be statistically significant by usual standards. More important, the assumption of a random 
selection of product classes is clearly unwarranted, as the text explains. It is for this reason that emphasis is placed 
on an evaluation of the biases of selection rather than on formal statistical tests, in generalizing from the selected 
group to the universe. 
TABLE 2. DISTRIBUTION OF COMPANY CONCENTRATION RATIOS, 1947 
AND 1954, 399 FIVE-DIGIT PRODUCT CLASSES AND 137 OF THE 138 
FOUR-DIGIT INDUSTRIES ENCOMPASSING THE 399 PRODUCT CLASSES 

Product Class or Industry    Percentage of       Percentage of       1947 to 1954 Change 
Share of Four Largest        Product Classes     Industries          Product 
Companies (Per Cent)         1954      1947      1954      1947      Classes   Industries 

80-100                       14.3      14.3      13.1      10.9        0.0       +2.1 
70-79.9                       8.5      10.0       7.3      11.7       -1.5       -4.4 
60-69.9                      13.0      15.0      14.6      10.9       -2.0       +3.7 
50-59.9                      14.0      16.8      13.1      15.3       -2.8       -2.2 
40-49.9                      18.3      16.0      14.6      12.4       +2.3       +2.2 
30-39.9                      17.5      14.0      15.3      11.7       +3.5       +3.6 
20-29.9                      10.5      10.5      10.9      17.5        0.0       -6.6 
0-19.9                        4.0       3.3      10.9       9.5       +0.7       +1.0 

Total                       100.0     100.0     100.0     100.0 

First Quartile               68.3      69.5      66.8      67.8       -1.2       -1.0 
Median                       49.8      53.6      48.7      49.1       -3.8       -0.4 
Third Quartile               36.0      37.9      32.0      28.8       -1.9       +3.2 

Note: Detail may not add to totals due to rounding. 


generalization of its biases support the inference that concentration decline is 
a concomitant of market growth. 

As a more direct test of the hypothesis a random sample of 57 product classes 
was drawn, one-seventh of the 399, and concentration change was correlated 
with the growth in product class shipments between 1947 and 1954. If market 
growth did result in lowered concentration levels, then one should find an in- 
verse relationship between the change in concentration ratios and the rates of 
growth of product class output. An inverse relationship was found for the 
sample of 57, and the negative correlation was large enough to support the in- 
ference that, had correlation measures been computed for the whole group of 
399, a negative correlation would have been found (Table 3). The rank correla- 
tion was of almost the same size whether concentration was measured as the 
share of the four largest or eight largest firms. This suggests that the choice of 
concentration measure is probably unimportant for present purposes. 
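
The rank correlation used in this test can be illustrated with a short computation. The sketch below is only an illustration under stated assumptions: the six pairs of growth ratios and concentration changes are hypothetical (they are not the 57 sampled product classes), the function names are invented for the example, and the t statistic is the usual large-sample approximation, which may not be the procedure behind Table 3.

```python
import math

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman coefficient, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

def t_statistic(r, n):
    """Approximate t statistic (n - 2 degrees of freedom) for a rank correlation r."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))

# Hypothetical data: ratio of 1954 to 1947 shipments, and change in the
# four-firm share (percentage points), for six imaginary product classes.
growth = [1.10, 1.45, 2.30, 0.95, 1.80, 1.25]
concentration_change = [2.0, -1.5, -6.0, 3.5, -0.5, -2.5]

r_s = spearman(growth, concentration_change)
print(round(r_s, 4), round(t_statistic(r_s, len(growth)), 4))
```

An inverse relationship between growth and concentration change appears as a negative coefficient. The same approximation applied to a coefficient of −.214 with 57 observations gives a t statistic of about −1.62.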

It would be mistaken to conclude from these findings that, for an unweighted 
distribution, the level of concentration in all product classes declined between 
1947 and 1954. True, the selection was biased in favor of product classes of in- 
creasing concentration and if this group showed a decline in concentration the 
non-included group, other things equal, should have shown an even larger de- 
cline. Offsetting this bias, however, is the relatively lower aggregate growth 
rate of the non-included group. To the extent that market growth is a factor in 
concentration decline the non-included group should have shown a smaller 
decline in concentration than the selected group, or possibly an increase. The 
evidence thus permits no strong inference about an unweighted total distribution
of concentration changes. The weighted distribution of course would be
more likely to show a greater decline than the unweighted since the selected
group, for which a decline was registered, accounted for an increasing share of
total shipments.


TABLE 3. COMPARISON OF 1947 TO 1954 CHANGES IN PRODUCT CLASS
CONCENTRATION AND MARKET GROWTH, 57 PRODUCT CLASSES

                                            Rank Correlation   Probability that Sample
Relationship Between Changes In:            Coefficient        Was Drawn from Universe of
                                            (Spearman)         Zero Rank Correlation

Largest Four Firms’ Share of Shipments
and Ratio of Total 1954 to Total 1947
Shipments                                      −.214

Largest Eight Firms’ Share of Shipments
and Ratio of Total 1954 to Total 1947
Shipments                                      −.216               .054

Largest Four Firms’ Share of Shipments
and Largest Eight Firms’ Share of
Shipments                                      +.942


4. INDUSTRY CONCENTRATION AND PRODUCT DIVERSIFICATION


Comparison of changes in concentration measures based on the industry and
product concepts may reveal something of the role that secondary product
trends have played in product concentration change. If there has been an increase
in the number and importance of multi-product establishments then
concentration measures based on the industry definition should show an increasingly
higher level relative to those based on the product definition.

The trend in industry concentration for 137 of the 138 four-digit industries
encompassing the 399 five-digit product classes is also presented in Table 2.
Changes in the three positional measures there presented are much smaller,
on the average, than the changes in equivalent measures for product classes.
More significant, where all three measures for product classes showed downward
movements, two of the measures for industries show smaller downward
movements while one shows a larger upward movement, the net change in the
three measures being positive. These findings suggest that increases in secondary
product activities, increases that are not revealed in industry data, may
have been a factor in the decrease in product concentration levels.

It is not possible to directly examine the 1947 to 1954 change in the divergence
of industry and product concentration. Four-digit product class group
measures, having the same classification (though not production unit) boundaries
as the four-digit industry measures, do not exist for 1947. It is possible,
however, to examine industry and product divergence in 1954, a year for which
both measures are available (Table 4). The table shows that the industry definition
produced a much greater number of higher 1954 concentration ratios
than did the product definition not only for the industries here examined, but
for all industries in much the same degree. The finding, while not conclusive,
at least carries the presumption that the divergence was lower in 1947.


TABLE 4. DIVERGENCE OF FOUR-DIGIT CONCENTRATION RATIOS, 1954,
BASED ON INDUSTRY AND PRODUCT CONCEPTS, ALL MANUFACTURING
INDUSTRIES AND 138 INDUSTRIES ANALYZED IN THIS PAPER

                                          Number of Industries    Percentage of Total
                                          All         138         All         138

Ratio on Industry Basis Higher Than on
Product Basis by:
  10 Per cent or More                      27           8          6.0         5.8
  5 to 9 Per cent                          77          28         17.2        20.3
  2 to 4 Per cent                         104          25         23.3        18.1

Difference of 1 Per cent or Less          162          59         36.2        42.8

Ratio on Product Basis Higher Than on
Industry Basis by:
  2 to 4 Per cent                          26           5          5.8         3.6
  5 to 9 Per cent                           7           3          1.6         2.2
  10 Per cent or More                       7           1          1.6         0.7

Not Available                              37           9          8.3         6.5

Total                                     447         138        100.0       100.0

Source: Concentration in American Industry, Table 35, p. 39; Table 37, pp. 41-62; Table 42, pp. 196-219.


Other evidence supports the inference that the divergence between industry
and product concentration was lower in 1947 than in 1954. For 344 industries
permitting comparison, of a total of 447, the share of secondary products produced
by establishments in given industry classifications rose by two or more
percentage points in 118 and fell by the same amount in only 102 (Table 5).
Moreover, for 347 industries permitting comparison the share of the total
product produced as secondary products by establishments outside the industry
rose by two or more percentage points in 130 and fell by the same amount
in only 101. Both changes reflect an increased diversification in the product-mix
of establishments.

This finding of an increased diversification of establishments does not necessarily
signify an increased divergence between industry and product concentration.
It is conceivable that it reflects only the greater decentralization of
product output among and diversification within plants owned by the same
companies, with no change in their total product-mixes." If this were the case
the divergence between industry and product concentration would not have
changed. This does not seem to have been entirely the case, however.


" Such decentralization and diversification would also be reflected in an increased divergence between company
and establishment product concentration measures. This is examined in detail in my paper “Measures of Product
Concentration, 1947 and 1954, Developed From Census Materials,” American Statistical Association Proceedings of
the Business and Economic Statistics Section, 1959, Washington, 1960.


Comparison of changes in secondary product activities and industry concen- 
tration measures suggests that the increased diversification of establishments 
was to some degree related to the increased divergence between industry and 
product concentration. This is most clearly apparent in Table 6 for the indus- 
tries permitting comparison among the 138 selected for this study, the 138 
industries for which the increased divergence between industry and product 
concentration was recorded in Table 2. Changes in concentration measures 
based on the industry concept are positively related to changes in the secondary 
product shipments of establishments classified in the given industry (the 100 
industries in the upper part of the table). To a lesser degree they are also posi- 
tively related to changes in the share of shipments of the given product by es- 
tablishments classified in other industries (104 industries in the lower part of 
the table). 


TABLE 5. SECONDARY PRODUCT ACTIVITY CHANGES IN
MANUFACTURING ESTABLISHMENTS, 1947 TO 1954

                        Secondary Product Shipments       Given Product’s Shipments by
                        as Per Cent of Total Shipments    Establishments Classified in
                        of Establishments Classified      Other Industries as Per Cent
                        in Given Industry                 of Total Shipments of Product

Change in Secondary     344 of Total    101 of 138        347 of Total    105 of 138
Product Activity        of 447          Selected          of 447          Selected
(Percentage Points)     Industries      Industries        Industries      Industries

+10 or more                  15              5                 20              5
+5 to +9                     42             12                 43             15
+2 to +4                     61             16                 67             20
+1 to −1                    124             38                116             39
−2 to −4                     61             21                 60             18
−5 to −9                     29              8                 28              6
−10 or less                  12              1                 13              2

Total                       344            101                347            105

Source: Concentration in American Industry, Table 45, pp. 266-85.


These findings support the following explanation of the process by which
concentration measures based on the industry concept became progressively
larger than those based on the product concept. Industry concentration measures
may have increased more often than not because the establishments of
firms in an industry devoted increasingly larger (though still minor) shares
of their output to secondary products, production still assigned to the primary
product (industry) classification. This had the effect of overstating the firm’s
share of the market for its primary product. In addition, the increasing production
of the given product as a secondary product by establishments outside the
industry, production classified in other industries, led to an understatement of
the total size of the product’s market. Both trends had the effect of increasing
the measure of industry relative to that of product concentration.


The evidence to support a generalization to all industries of the finding for 
the 138 selected industries is not strong. A positive relationship is also found 
for the larger group between changes in industry concentration and changes in 
the secondary product shipments of establishments in a given industry (Table 
6, 346 industries in the upper part of the table). However it is somewhat less 


TABLE 6. RELATIONSHIP BETWEEN CHANGES IN SECONDARY 
PRODUCT ACTIVITY AND CONCENTRATION, 1947 TO 1954 
FOUR-DIGIT INDUSTRIES 


Change in Share Ratio of Actual Number of Industries in Each Cell to the 
of Four Largest Number that Would Be Found if There Were No Relationship 
Firms (Percentage Between Concentration Change and Secondary Product 
Points) Activity Change 


Change in Secondary Product Shipments as Per Cent of Total Shipments of 
Establishments Classified in Given Industry (Percentage Points) 


346 of Total of 447 100 of 138 Industries 
Manufacturing Industries Selected in This Study 


+2or more +1 to —1 —2orless|+2or more +1 to —1 —2or less 


+2 or more ‘ 0.98 
+1to —1 1.20 
—2 or less a 0.92 


Change in Given Product’s Shipments by Plants Classed in Other Industries as 
Per Cent of Total Shipments of Given Product (Percentage Points) 


347 of Total of 447 104 of 138 Industries 
Manufacturing Industries Selected in This Study 


+2ormore +lto —1 —2orless|+2ormore +1to —1 —2or less 


+2 or more 
+l1to -1 
—2 or less 


Source: Concentration in American Industry, Table 45, pp. 266-85. 


pronounced than that found for the selected group for which corroborative 
direct evidence was available. Moreover, there appears to be no relationship 
for the larger group between changes in industry concentration and changes in 
the share of shipments of given products by establishments outside the indus- 
try (347 industries in the lower part of the table). A slight positive relationship 
was found for the selected group. While the findings for the larger group are 
not in contradiction to those for the selected group, they are not in strong 
agreement. At most they provide weak support for the inference that the in- 
creased divergence between industry and product concentration was a general 
phenomenon. 
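
The measure reported in Table 6, the ratio of the actual number of industries in a cell to the number that would be found if there were no relationship, can be sketched as follows. The counts below are hypothetical, and the "no relationship" count is taken as (row total × column total) / grand total; this is the standard independence benchmark and is assumed here to be the one intended.

```python
def cell_ratios(counts):
    """Ratio of observed to expected counts under independence.

    counts[i][j] is the observed number of industries whose concentration
    change falls in row class i and whose secondary product activity change
    falls in column class j.
    """
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    grand_total = sum(row_totals)
    return [
        [
            counts[i][j] / (row_totals[i] * col_totals[j] / grand_total)
            for j in range(len(col_totals))
        ]
        for i in range(len(row_totals))
    ]

# Hypothetical 3 x 3 table (rows: concentration change +2 or more, +1 to -1,
# -2 or less; columns: secondary product change in the same three classes).
observed = [
    [30, 20, 10],
    [20, 30, 20],
    [10, 20, 30],
]

for row in cell_ratios(observed):
    print([round(r, 2) for r in row])
```

Ratios above 1 along the main diagonal, with ratios below 1 in the opposite corners, are what a positive relationship looks like in this form of table. Ratios all near 1 indicate no relationship.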


0.88 1.26 0.88 0.87 

1.13 0.74 |e 1.16 

1.07 0.81 1.09 1.15 0.89 0.93 

P| 0.77 1.47 0.75 0.59 1.49 0.91 

1.02 0.96 1.03 1.07 0.84 1.03 


5. SUMMARY AND INTERPRETATION 


The hypothesis that market growth leads to lower levels of product concen- 
tration received positive support from two tests. First, it was found that the 
level of concentration declined in those product classes that, in the aggregate, 
showed a higher than average rate of growth. More convincing, when concen- 
tration change was correlated with growth rate for individual product classes, 
a negative and statistically significant relationship was found. 

The second test indicated that product concentration change was related to 
changes in the secondary product production, or diversification, of companies. 
Only data on the diversification of establishments were available; however, it 
was possible to show that the change in establishment diversification was not 
neutral with respect to company concentration change. The increased diversi- 
fication of companies explains the increased divergence between company con- 
centration measures based on the industry and product concepts. The increased 
divergence of the two measures in turn reflects the growing importance of the 

secondary products of companies in aan concentration in their several 
product markets. 

While the hypothesis receives support from the experience of the seven-year 
period 1947-54, there is a question as to whether the period itself is appropriate 
for such a test. Certainly one could place more confidence in tests based on a 
period much longer than seven years, as the processes of market growth and 
structure change are much longer-run processes. For this reason alone the tests 
permit no drawing of final conclusions. There is yet another reason for question- 
ing the appropriateness of the period, and that is its beginning year 1947, a 
year of postwar readjustment. This may be especially important in interpreting 
the effect of product diversification on concentration change. If wartime pro- 
duction controls narrowed the product-mix of establishments and companies, 
the immediate postwar year of 1947 may still have reflected the forced special- 
ization of production rather than a more diversified peacetime pattern. To the 
degree this is true this paper may be more a description of an aspect of war-to- 
peace economic readjustment than of the evolution of free markets. 


ON FINITE SAMPLE DISTRIBUTIONS OF GENERALIZED 
CLASSICAL LINEAR IDENTIFIABILITY TEST STATISTICS* 


R. L. BASMANN 
Technical Military Planning Operation 
General Electric Company 


In the estimation of econometric simultaneous equations models, 
hypothesized necessary conditions for the identifiability of a single 
equation usually specify the exclusion of a number of variables from the 
structural equation in question. If the pre-determined variables are 
completely exogenous, if the disturbances in the equations are jointly 
normally distributed, and if a moderately high degree of precision can 
be obtained in reduced-form estimation, then the exact finite sample 
distribution of the generalized classical linear identifiability test statistic 
can be closely approximated by Snedecor’s F with appropriate degrees 
of freedom. 


1. INTRODUCTION 


The purpose of this note is to make more widely known some results of an 
experiment, Basmann [5], that have significant implications for 
the practical use of identifiability tests and confidence regions in econometric 
statistics, and that suggest potentially fruitful lines of mathematical inquiry 
into the question of the finite sample distributions of the former. Several con- 
jectures about the finite sample distributions of the LVR (least variance ratio) 
[1, 2, 13] and GCL (generalized classical linear) [3, 14, 15] identifiability 


statistics have been put to test with interesting results. The derivation of these 
conjectures was not arbitrary. For instance, the limiting distributions of the 
LVR and GCL test statistics are known, [1, 2, 4, 12]; as a matter of heuristic 
and practical strategy, it is natural to select for initial study those finite sample 
distributions that possess the required limiting form and have been tabulated. 


Let

$$y_{.1} = Y_2\beta_2^* + X_1\gamma_1^* + X_2\gamma_2^* + e_{.1} \qquad (1.1)$$

denote a single structural equation in a system of G structural equations. $y_{.1}$ denotes
a column vector of N independent observations of an endogenous variable
$y_1$; $Y_2$ denotes a matrix ($N \times G-1$) of N independent observations of G−1
additional endogenous variables $y_2, y_3, \ldots, y_G$ hypothetically appearing in
(1.1); $X_1$ denotes a matrix ($N \times K_1$) of independent observations of exogenous
variables (see Section 2, ftn. 3) hypothetically appearing in (1.1); $X_2$ denotes a
matrix ($N \times K_2$) of exogenous variables hypothetically absent from (1.1), but appearing
in one or more of the reduced-form equations [13, pp. 135 ff.] corresponding
to the endogenous variables $y_2, y_3, \ldots, y_G$. $\beta_2^*$, $\gamma_1^*$, $\gamma_2^*$ are column
vectors of G−1, $K_1$ and $K_2$ components, respectively; the asterisk is used to
denote population values. Finally, $e_{.1}$ denotes a column vector of N independent
and identically distributed random variables with


* A part of this work was performed under Contract No. AT(45-1)-1350 between the Atomic Energy Com- 
mission and General Electric Company. 
1 A second line of attack has been inspired by the early papers of Geary [11] and Fieller [10]. 

$$E(e_{.1}) = 0, \qquad (1.2)$$

$$E(e_{.1}e_{.1}') = \sigma^2 I_N. \qquad (1.3)$$

The identifiability hypothesis is

$$H_0: \gamma_2^* = 0. \qquad (1.4)$$

We shall confine our attention to the case in which $K_2 \ge G$; that is, to overidentifying
hypotheses.

Let $\beta_2$ denote a column vector of G−1 components defined over the space
of $\beta_2^*$. Form

$$G_1(\beta_2) = (y_{.1} - Y_2\beta_2)'[I_N - X_1(X_1'X_1)^{-1}X_1'](y_{.1} - Y_2\beta_2), \qquad (1.5)$$

$$G_2(\beta_2) = (y_{.1} - Y_2\beta_2)'[I_N - X(X'X)^{-1}X'](y_{.1} - Y_2\beta_2), \qquad (1.6)$$

$$Q(\beta_2) = G_1(\beta_2) - G_2(\beta_2), \qquad (1.7)$$

$$\phi(\beta_2) = \frac{G_1(\beta_2) - G_2(\beta_2)}{G_2(\beta_2)}, \qquad (1.8)$$

where $X = [X_1 \;\; X_2]$. GCL estimates of $\beta_2^*$ are obtained by minimization of (1.7) with respect to $\beta_2$
[cf. 3]; denote these estimates by $\hat\beta_2$; LVR estimates of $\beta_2^*$ are obtained by minimization
of (1.8) [cf. 1, 2, 13]; denote these estimates by $\tilde\beta_2$. Finally, let

$$\hat\phi = \frac{G_1(\hat\beta_2) - G_2(\hat\beta_2)}{G_2(\hat\beta_2)}, \qquad (1.9)$$

$$\tilde\phi = \frac{G_1(\tilde\beta_2) - G_2(\tilde\beta_2)}{G_2(\tilde\beta_2)}. \qquad (1.10)$$

If the identifiability hypothesis (1.4) is valid, then, with $N \to \infty$,

$$N\hat\phi \sim \chi^2, \quad K_2-G+1 \text{ d.f.} \qquad (1.11)$$

and

$$N\tilde\phi \sim \chi^2, \quad K_2-G+1 \text{ d.f.} \qquad (1.12)$$

under fairly general conditions [1, 2, 3, 13]. If the identifiability hypothesis
(1.4) is not valid, then

$$\lim_{N\to\infty} N E(\hat\phi) = +\infty, \qquad (1.13)$$

and

$$\lim_{N\to\infty} N E(\tilde\phi) = +\infty \qquad (1.14)$$

except in a special case.² Consequently, the critical region (of rejection)

$$N\phi \ge \chi^2_{K_2-G+1}(\epsilon), \qquad (1.15)$$

where $\phi$ denotes either $\hat\phi$ or $\tilde\phi$, has been proposed as a large-sample test of
identifying restrictions [cf. 13, pp. 178-83, and 4]. The critical region (1.15)
has also been used as an approximate finite sample test [cf. 16, p. 184].


² The validity of (1.4) implies that the rank condition holds [cf. 13, p. 138]; the converse, however, is not true.


If, however, the reduced-form disturbances are jointly normally distributed
with zero means and covariance matrix $\Sigma$, for a fixed vector $\beta_2$, and
$K = K_1 + K_2$,

$$\frac{N-K}{K_2}\,\phi(\beta_2) \sim F_{K_2,\,N-K}. \qquad (1.16)$$

Anderson [1] and Anderson and Rubin [2] have proposed the critical region

$$\frac{N-K}{K_2}\,\tilde\phi \ge F_{K_2,\,N-K}(\epsilon) \qquad (1.17)$$

as an approximate finite sample test of identifying restrictions. These authors
have pointed out that this test is “conservative” in the sense that the probability
of rejecting the null hypothesis, when it is valid, is less than $\epsilon$.

The following considerations suggest an appropriate alternative. As N increases
indefinitely, the distributions of $(N-K)\hat\phi$ and $(N-K)\tilde\phi$ converge to
the distribution of a $\chi^2$ variate with $K_2-G+1$ degrees of freedom, whereas the
distribution of $K_2 F_{K_2,\,N-K}$ converges to the distribution of $\chi^2$ with $K_2$ degrees
of freedom. Consequently, there is an (heuristic) inconsistency between
the asymptotic critical region (1.15) and the proposed finite sample critical
region (1.17). On the other hand, critical regions based on $F_{K_2-G+1,\,N-K}$ are consistent
with (1.15). Moreover, GCL estimation of $\beta_2^*$ from the sample imposes
G−1 linear restrictions on $Q(\hat\beta_2)$ [4]; that is, $Q(\hat\beta_2)$ is a quadratic form of rank
$K_2-G+1$. These considerations suggest the use of critical regions

$$\frac{N-K}{K_2-G+1}\,\hat\phi \ge F_{K_2-G+1,\,N-K}(\epsilon), \qquad (1.18)$$

$$\frac{N-K}{K_2-G+1}\,\tilde\phi \ge F_{K_2-G+1,\,N-K}(\epsilon) \qquad (1.19)$$

as approximate finite sample tests of identifying restrictions.


2. EXPERIMENTAL RESULTS 


In order to investigate the reliabilities of approximate finite sample identifiability
tests based on the critical regions (1.15), (1.17), (1.18), and (1.19),
and to evaluate the potential fruitfulness of a proposed mathematical investigation
of the finite distribution function of $\hat\phi$, I have conducted an experiment
with 200 independent samples of endogenous variables generated by the population
defined in Tables 1, 2, 3.³


3 A comprehensive description of the experimental design is provided elsewhere [cf. 7, Section 3]. The sampling
experiment served as an adjunct to some mathematical investigations of the exact finite sample frequency functions
of GCL coefficient estimators. As the mathematical theorems refer only to the case in which no lagged endogenous
variables appear among the predetermined, it was deemed wise to confine the initial experimental work to that
case, in order that the experimental results in respect of the identifiability test statistic, interval estimates, and 
estimates of residual variances of e; in (1.1) be explicitly connected with the exact mathematical results obtained 
for estimates of and 


TABLE 2a. “TRUE” REDUCED FORM EQUATIONS 


ri riz ria ris ris 


(2.1) ya= 0.731524 —1.34247 —1.245lza +1.69652 —7.2763 +na 
(2.2) ye= 0.597324 —0.51362 +0.13232¢ +0 .0447 244 +1.0856 +ne 
(2.3) ya= —1.5800zr —0.6540z ¢s +1.1910r +0.4000 —10.0700 +n¢ 


TABLE 2b. VARIANCE-COVARIANCE MATRIX: 
REDUCED-FORM DISTURBANCES 


448 . 2882 —63 .2987 121.4734 
= =(cni) = 


21.2150 —20.7019 
303.0165 


TABLE 3. EXOGENOUS VARIABLES 


170 
162 
166 
155 
159 
164 
147 
149 
163 
152 
170 
144 
141 
148 
141 


The reduced form disturbances m,; were generated from £,,~n (0, 1) by the 
transformation: 


m = 14.71815 &; — 11.99752 & — 9.36609 
2.05985 & + 4.11971 (2.1) 
ns = 10.05015 &; + 10.05015 & — 10.05015 &; 
The structural equation (1.1) in Table 1 was estimated by LVR and GCL 
methods under the correctly specified hypothesis 
Ho:yu* =0, = 0, = 0. (2.2) 
The 200 independent experimental values of $\tilde\phi$ and $\hat\phi$ have been used to test the
following conjectures:

$$N \ln(1 + \tilde\phi) \overset{\text{dist.}}{\sim} \chi^2, \quad K_2-G+1 \text{ d.f.}, \qquad (2.3)$$

$$\frac{N-K}{K_2}\,\tilde\phi \overset{\text{dist.}}{\sim} F_{K_2,\,N-K}, \qquad (2.4)$$

157 64 12 79 95
90 25 68 72 
67 45 67 89 
68 24 74 81 
56 37 70 100 
68 12 23 98 
| 61 38 51 80 
90 10 60 72 
54 14 23 85 
83 15 35 79 
50 49 87 76 
50 14 89 86 
88 18 21 97 
99 32 55 97 
97 16 47 87 
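72 25 48 103

$$\frac{N-K}{K_2-G+1}\,\tilde\phi \overset{\text{dist.}}{\sim} F_{K_2-G+1,\,N-K}, \qquad (2.5)$$

$$N\hat\phi \overset{\text{dist.}}{\sim} \chi^2, \quad K_2-G+1 \text{ d.f.}, \qquad (2.6)$$

$$\frac{N-K}{K_2}\,\hat\phi \overset{\text{dist.}}{\sim} F_{K_2,\,N-K}, \qquad (2.7)$$

$$\frac{N-K}{K_2-G+1}\,\hat\phi \overset{\text{dist.}}{\sim} F_{K_2-G+1,\,N-K}, \qquad (2.8)$$

with N=16, G=3, K=6, $K_2$=3. The conjectures are to be read: e.g., (2.3) the
distribution of $N \ln(1+\tilde\phi)$ can be closely approximated (numerically) by the
distribution of $\chi^2$ with $K_2-G+1$ degrees of freedom.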
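
For the experimental values just stated, the degrees of freedom in the conjectures about the GCL statistic can be worked out explicitly. The display below is a restatement under the assumption that $\hat\phi$ denotes the GCL statistic; it adds nothing beyond substituting N=16, G=3, K=6, $K_2$=3.

```latex
\begin{aligned}
K_2 - G + 1 &= 3 - 3 + 1 = 1, \qquad N - K = 16 - 6 = 10,\\[4pt]
\text{(2.6):}\quad & 16\,\hat\phi \sim \chi^2_{1}
  \;\Longleftrightarrow\; \hat\phi \sim \tfrac{1}{16}\,\chi^2_{1},\\[4pt]
\text{(2.7):}\quad & \tfrac{10}{3}\,\hat\phi \sim F_{3,\,10},\\[4pt]
\text{(2.8):}\quad & 10\,\hat\phi \sim F_{1,\,10}
  \;\Longleftrightarrow\; \hat\phi \sim \tfrac{1}{10}\,F_{1,\,10}.
\end{aligned}
```

These are the two reference distributions, $\chi^2_1/16$ and $\tfrac{1}{10}F_{1,10}$, against which the empirical distribution is compared in the tests that follow.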

Results of tests of conjectures (2.3), (2.4), and (2.5) have been reported else- 
where in another connection [6]. Briefly, the conjectures (2.3) and (2.4) were 
strongly contradicted by the experimental results, which indicate that the 
LVR test of identifiability based on the critical region (1.15) is markedly biased 
against a valid identifiability hypothesis, and that the LVR test based on (1.17) 
is extremely biased in favor of a valid identifiability hypothesis. In the case of 
(2.5), the maximum deviation between the experimental distribution and the 
conjectured distribution was non-significant, although a slight bias in favor of 
the identifiability hypothesis was indicated. 

In Table 4 are shown 200 independent observations of $\hat\phi$ arranged in order
of increasing magnitude. It is believed that the values shown are correct to
within one unit in the fifth decimal. The experimental mean and standard
deviation are shown in the heading. The empirical probability (set) function
[cf. 9, pp. 56-8]

$$P_{200}\{\hat\phi < x\} = \frac{\nu(x)}{200}, \quad 0 \le \nu(x) \le 200, \qquad (2.9)$$

where $\nu(x)$ is the number of observed experimental values of $\hat\phi$ less than x,
x = 0, .001, .002, ..., is exhibited in Figure 1. Superimposed on Figure 1 are
several points of the distribution functions of $\frac{1}{10}F_{1,10}$ and $\chi^2_1/16$.

Test of Conjecture 2.6. Since $\hat\phi \ge \tilde\phi \ge \ln(1+\tilde\phi)$ as $\hat\phi \ge 0$, the unfavorable outcome
of the test of conjecture (2.3), already noted, implies a fortiori an unfavorable
outcome of the test of conjecture (2.6). Visual inspection of Figure 1
indicates a uniform and large deviation between the empirical and conjectured
distribution functions, and that the critical region (1.15), $\phi = \hat\phi$, is strongly
biased against the valid identifiability hypothesis. The maximum absolute deviation
between the empirical probability function (2.9) and the conjectured
distribution function (of $\chi^2_1/16$) is located (by visual inspection) at x=.103 and
is approximately $d_{200}$=.15. Since the empirical probability (set) function
$P_{200}(\hat\phi < x)$, (2.9), closely approximates the empirical distribution function $F_{200}(x)$
defined by

$$F_{200}(x) = P_{200}\{\hat\phi \le x\} = \frac{n(x)}{200}, \qquad (2.10)$$

TABLE 4 
8s = .33990 med. = .05070 


-00915 - 15881 
-00947 - 15933 
-91025 - 16089 
-01068 -16341 
-01094 - 16577 
-01096 - 17036 
-01105 .17375 
-01141 : 17433 
01183 17855 
-01310 - 18876 
-01328 19290 
-01371 a - 19533 
-01385 19866 
-01526 -20077 
-01694 20407 
-01780 -2.898 
-01789 .21924 
-01806 -22702 
.01817 -23196 
-01937 -23355 
-01945 -23985 
-02020 -25083 
-02077 - 25993 
-02127 26132 
.02242 . 27182 
-02383 28189 
-02441 ‘ 29244 
-02512 -30500 
.02626 -30759 
-02788 -32072 
.03221 -32356 
-03359 -33025 
-03449 -35010 
-03535 -35173 
-03589 -35330 
-03659 d .36076 
-03754 -38044 
-03854 -41225 
-03857 
-03910 
-04005 
-04060 
‘ -04160 
.00699 -04175 
.00718 -04185 
-00736 -04293 
-00817 -04350 
-00898 -04403 
-00899 -04710 
-00902 04861 

Figure 1. Empirical distribution of $\hat\phi$, $P_{200}(\hat\phi < x)$, with points of the distributions of $\chi^2_1/16$ and $\frac{1}{10}F_{1,10}$ superimposed.


where x is continuous and n(x) denotes the number of observations of $\hat\phi$ that do
not exceed x, [12], we make use of the limiting form of the Kolmogorov-Smirnov
distribution in the test of this conjecture. Let $D_{200} = \sup_x |F(x) - F_{200}(x)|$ where
F(x) is the conjectured distribution; then

$$P\{D_{200} > .15\} < .01 \qquad (2.11)$$

under the hypothesis that $\hat\phi$ is distributed like $\chi^2/16$ with one degree of freedom
[cf. 8, p. 427]. This result is, as was expected, in strong contradiction to the
conjecture that $16\hat\phi \sim \chi^2_1$. That is to say, the finite sample distribution of $16\hat\phi$
does not closely approximate its known large sample (asymptotic) distribution
[cf. 4, Section 5].
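
The probabilities quoted for the Kolmogorov-Smirnov statistic, (2.11) above and (2.12) below, can be checked against the limiting form of the distribution. The sketch that follows is illustrative only: it assumes the standard limiting series for $\sqrt{n}\,D_n$, the function names are invented for the example, and no data from Table 4 are used.

```python
import math

def kolmogorov_tail(lam, terms=100):
    """Limiting P{ sqrt(n) * D_n > lam } = 2 * sum_{k>=1} (-1)^(k-1) exp(-2 k^2 lam^2)."""
    if lam <= 0:
        return 1.0
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
        for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))

def ks_distance(sample, cdf):
    """sup_x |F(x) - F_n(x)|, checked on both sides of each jump of F_n."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs)
    )

n = 200
p_large_deviation = kolmogorov_tail(math.sqrt(n) * 0.15)  # deviation of .15
p_small_deviation = kolmogorov_tail(math.sqrt(n) * 0.02)  # deviation of .02
print(p_large_deviation < 0.01, p_small_deviation > 0.999)
```

With 200 observations a maximum deviation of .15 has a limiting tail probability far below .01, whereas a maximum deviation of .02 would be exceeded almost always, even when the conjectured distribution is the true one.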
Test of Conjecture 2.7. A test of this conjecture would be superfluous,
cf. [6].

Test of Conjecture 2.8. The maximum absolute deviation between the empirical
probability function (2.9) in Figure 1, and the conjectured distribution
is located (visually) at x=.100 and is approximately $d_{200}$=.02. Given that the
experimental $\hat\phi$ is distributed as $\frac{1}{10}F_{1,10}$,

$$P\{D_{200} > .02\} = 1.0. \qquad (2.12)$$

A $\chi^2$ goodness-of-fit test employing six intervals with suprema equal to the
p=.10, .30, .50, .70, .90 points of the $\frac{1}{10}F_{1,10}$ distribution yields a criterion
value

$$u = .3500, \qquad (2.13)$$

from which

$$P\{\chi^2 > .3500\} > .995, \qquad (2.14)$$


a result fully consistent with the result (2.12) of the Kolmogorov-Smirnov test. 
Visual inspection of Figure 1 leads to the conclusion that the goodness-of-fit 
criterion, u, would not be changed significantly by increasing the number of 
intervals. 
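
The goodness-of-fit criterion described above can be sketched as follows. The interval probabilities follow from the stated cut points (.10, .30, .50, .70, .90), but the observed counts are hypothetical: the paper reports only the criterion value, not the counts, and the function name is invented for the example.

```python
def chi_square_criterion(observed, probabilities):
    """Pearson criterion: sum over intervals of (observed - expected)^2 / expected."""
    n = sum(observed)
    return sum(
        (o - n * p) ** 2 / (n * p) for o, p in zip(observed, probabilities)
    )

# Six intervals bounded by the .10, .30, .50, .70 and .90 points of the
# conjectured distribution, so their probabilities are fixed in advance.
interval_probabilities = [0.10, 0.20, 0.20, 0.20, 0.20, 0.10]

# Hypothetical counts for 200 observations (expected: 20, 40, 40, 40, 40, 20).
hypothetical_counts = [19, 42, 40, 39, 41, 19]

u = chi_square_criterion(hypothetical_counts, interval_probabilities)
print(round(u, 4))
```

A criterion value this small means that the observed counts sit very close to the expected ones, which is the sense in which the fit reported in (2.13) is unusually close.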

The most remarkable feature of the experimental results is the almost per- 
fect agreement between the conjectured distribution $\frac{1}{10}F_{1,10}$ and the empirical 
distribution of the GCL statistic $\hat\phi$. Indeed, even if conjecture (2.8) is valid, 
the extremely close fit is a rare event. For all that, the most we can validly 
assert is the (rather obvious) fact that the experimental results are consistent 
with the conjecture. 


3. CONCLUSION 


In the Introduction the conjectures (2.4) and (2.8) were motivated primarily
on the grounds of their being consistent with the known asymptotic distributions
(1.11) and (1.12) of the LVR and GCL test statistics $\tilde\phi$ and $\hat\phi$. The conjecture
(2.8) was motivated, however, by more fundamental considerations intimately
related to the question of what parametric conditions are sufficient to
ensure that certain non-linear functions of GCL coefficient estimators $\hat\beta_2$, $\hat\gamma_1$,
are approximately jointly normally distributed; these considerations involve a
generalization of the early contributions of Geary [11] and Fieller [10].⁴ The
experimental results indicate the potential fruitfulness of further mathematical
investigation of these parametric conditions.

The experimental results also indicate that it is desirable to reexamine the 
conclusions of empirical studies in which identifiability tests based on the criti- 
cal regions (1.15) or (1.17) have been used, or in which confidence regions based 
on $F_{K_2,\,N-K}$ have been applied. Finally, it would be desirable to apply identifiability
tests in some previous studies in which they have not been applied.


A NOTE ON THE LIMITING RELATIVE EFFICIENCY OF 
THE WALD SEQUENTIAL PROBABILITY RATIO TEST* 


Robert BECHHOFER 
Cornell University 


The efficiency (measured in terms of ratio of average sample size to
fixed sample size) of the Wald sequential probability ratio test relative
to the best competing fixed sample procedure for testing $H_0: \theta = \theta_0$ versus
$H_1: \theta = \theta_1$ ($\theta_0 < \theta_1$) when X is distributed $N(X \mid \theta, \sigma^2)$ with known $\sigma^2$ is
studied for fixed ($\theta, \theta_0, \theta_1$) as the specified probabilities of error, $\alpha$ (the
probability of rejecting $H_0$ when $\theta = \theta_0$) and $\beta$ (the probability of accepting
$H_0$ when $\theta = \theta_1$), approach zero. The limit is shown to depend on
the relative rates at which $\alpha$ and $\beta$ approach zero in a prescribed manner.
In particular, when $a\alpha = \beta$ ($a > 0$) this limiting relative efficiency is
equal to $(\theta_1 - \theta_0)/(4\,|\theta_0 + \theta_1 - 2\theta|)$. The practical implications of this result
are discussed.


1. THEORETICAL DEVELOPMENT


We shall be concerned with the classical problem of testing $H_0: \theta = \theta_0$ versus
$H_1: \theta = \theta_1$ ($\theta_0 < \theta_1$) when X is distributed $N(X \mid \theta, \sigma^2)$ with $\theta$ ($-\infty < \theta < +\infty$)
unknown and $\sigma^2$ ($\sigma^2 > 0$) known. The test is to have the properties that

$$\text{Prob}\{\text{Reject } H_0 \mid \theta = \theta_0\} \le \alpha, \qquad \text{Prob}\{\text{Accept } H_0 \mid \theta = \theta_1\} \le \beta, \qquad (1)$$

where $0 < \alpha, \beta < 1$ and $\alpha + \beta < 1$. As a solution to this problem Wald proposed
his sequential probability ratio test (WSPRT) [9]. We shall study the efficiency
(measured in terms of ratio of average sample size to fixed sample size) of the
WSPRT to the best competing fixed sample procedure as ($\alpha$, $\beta$) approach zero
in a prescribed manner.

We shall use many of the results given in [9] and, in general, shall adopt the
same notation. However, for simplicity we let

$$\delta = \theta/\sigma - (\theta_0 + \theta_1)/2\sigma, \qquad (2)$$

$$\delta^* = (\theta_1 - \theta_0)/2\sigma; \qquad (3)$$

then $\theta = \theta_0$ and $\theta = \theta_1$ are equivalent to $\delta = -\delta^*$ and $\delta = \delta^*$, respectively. Using
(3:48) and (3:60) in [9], the Wald approximations (3:43) and (3:57) to the
true OC (probability of accepting $H_0$) and ASN (average sample size) curves
become, respectively,

$$L(\delta) \approx \frac{A^{-\delta/\delta^*} - 1}{A^{-\delta/\delta^*} - B^{-\delta/\delta^*}} \qquad (4)$$

and

$$E_\delta(n) \approx \frac{L(\delta)\log B + [1 - L(\delta)]\log A}{2\delta\delta^*} \qquad (5)$$

where, as in [9] p. 45, we let $A = (1-\beta)/\alpha$ and $B = \beta/(1-\alpha)$. The ratios of these
approximations (which neglect the “excess over the boundary”) to the corresponding
true values are known to approach unity very rapidly as the ASN
grows large. (It has been shown in [5] that the right members of (4) and (5)
are actually lower bounds to the true OC and ASN, respectively.)


* This research was sponsored by the United States Air Force through the Air Force Office of Scientific Research
of the Air Research and Development Command under Contract AF 49(638)-230.

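
The approximations (4) and (5), together with the fixed sample size that follows, can be evaluated numerically. The sketch below is an illustration under stated assumptions: it takes the OC approximation in the form $(A^h - 1)/(A^h - B^h)$ with $h = -\delta/\delta^*$, uses the Python standard library's normal quantile function for the normal percentage point, and uses function names invented for the example. The case $\delta = 0$, where (4) and (5) become indeterminate, is deliberately left out.

```python
import math
from statistics import NormalDist

def upper_point(p):
    """Point exceeded with probability p by a standard normal variate."""
    return -NormalDist().inv_cdf(p)

def wald_oc(delta, delta_star, alpha, beta):
    """Wald approximation (4) to the probability of accepting H0."""
    if delta == 0:
        raise ValueError("delta = 0 is not covered by this sketch")
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    h = -delta / delta_star
    a_h, b_h = math.exp(h * log_a), math.exp(h * log_b)
    return (a_h - 1.0) / (a_h - b_h)

def wald_asn(delta, delta_star, alpha, beta):
    """Wald approximation (5) to the average sample size."""
    log_a = math.log((1.0 - beta) / alpha)
    log_b = math.log(beta / (1.0 - alpha))
    oc = wald_oc(delta, delta_star, alpha, beta)
    return (oc * log_b + (1.0 - oc) * log_a) / (2.0 * delta * delta_star)

def fixed_sample_size(delta_star, alpha, beta):
    """Fixed sample size before rounding up to an integer.

    The sum of the two upper points is used; it equals the difference
    between the alpha point and the (1 - beta) point by symmetry.
    """
    total = upper_point(alpha) + upper_point(beta)
    return total ** 2 / (4.0 * delta_star ** 2)

def relative_efficiency(delta, delta_star, alpha, beta):
    """Ratio of approximate average sample size to fixed sample size."""
    return wald_asn(delta, delta_star, alpha, beta) / fixed_sample_size(
        delta_star, alpha, beta
    )

print(round(relative_efficiency(-0.5, 0.5, 0.01, 0.01), 3))
```

At $\delta = -\delta^*$ (that is, $\theta = \theta_0$) with $\alpha = \beta = .01$ the ratio is roughly .42. It moves toward the limiting value of 1/4 only slowly as the error probabilities are reduced.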

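We define $\lambda_P$ by the equation

$$ \frac{1}{\sqrt{2\pi}} \int_{\lambda_P}^{\infty} \exp\{-\tfrac{1}{2}x^2\}\,dx = P \tag{6} $$

where $0 < P < 1$. Then the fixed sample size which guarantees (1) is (see (3:67))
the smallest integer equal to or greater than

$$ (\lambda_\alpha - \lambda_{1-\beta})^2 / 4(\delta^*)^2, \tag{7} $$

and the relative efficiency (neglecting excess) is given by

$$ \mathrm{RE}^{\sim}(\alpha, \beta; \delta, \delta^*) = \frac{2\delta^*}{\delta} \cdot \frac{L(\delta) \log B + [1 - L(\delta)] \log A}{(\lambda_\alpha - \lambda_{1-\beta})^2}. \tag{8} $$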


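We shall obtain the limit of (8) when $\beta = a\alpha^b$ ($a, b > 0$) and $\alpha \to 0$. Now the second
element of the numerator of the r.h.s. of (8) becomes
$L(\delta)\{\log(a\alpha^b) - \log(1-\alpha)\} + [1 - L(\delta)]\{\log(1 - a\alpha^b) - \log\alpha\}$,
which approaches $-\log\alpha + L(\delta)\{\log\alpha + \log(a\alpha^b)\}$ as $\alpha \to 0$.
It is straightforward (but tedious) using l'Hospital's rule to show that
$L(\delta)\{\log\alpha + \log(a\alpha^b)\}$ approaches $\log\alpha + \log(a\alpha^b)$ for $\delta < 0$
and zero for $\delta > 0$ as $\alpha \to 0$. Hence we have that

$$
\lim_{\alpha \to 0} \mathrm{RE}^{\sim}(\alpha, a\alpha^b; \delta, \delta^*) =
\begin{cases}
\dfrac{2\delta^*}{\delta} \displaystyle\lim_{\alpha \to 0} \dfrac{\log a\alpha^b}{(\lambda_\alpha + \lambda_{a\alpha^b})^2} & \text{for } \delta < 0 \\[2ex]
\dfrac{2\delta^*}{\delta} \displaystyle\lim_{\alpha \to 0} \dfrac{-\log\alpha}{(\lambda_\alpha + \lambda_{a\alpha^b})^2} & \text{for } \delta > 0.
\end{cases}
\tag{9}
$$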


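Now it is known (see, e.g., Feller [6], formula (1.7) on p. 166) that for $\lambda_P$ large,
i.e., for $P$ very close to zero, (6) can be written

$$ P \sim \frac{1}{\sqrt{2\pi}\,\lambda_P} \exp\{-\tfrac{1}{2}(\lambda_P)^2\}, \tag{10} $$

and hence that $\log P \sim -\tfrac{1}{2}(\lambda_P)^2 - \log\lambda_P - \log\sqrt{2\pi} \sim -\tfrac{1}{2}(\lambda_P)^2$
(since $(\lambda_P)^2$ dominates $\log\lambda_P$). Thus, for $\alpha$ very close to zero we have
$\lambda_\alpha \sim \sqrt{-2\log\alpha}$ and $\lambda_{a\alpha^b} \sim \sqrt{-2\log(a\alpha^b)}$;
substituting these in (9) we obtain

$$
\lim_{\alpha \to 0} \mathrm{RE}^{\sim}(\alpha, a\alpha^b; \delta, \delta^*) =
\begin{cases}
-\dfrac{\delta^*}{\delta} \left[\dfrac{1}{1 + b^{-1/2}}\right]^2 & \text{for } \delta < 0 \\[2ex]
\dfrac{\delta^*}{\delta} \left[\dfrac{1}{1 + b^{1/2}}\right]^2 & \text{for } \delta > 0.
\end{cases}
\tag{11}
$$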


When $b = 1$, i.e., when $a\alpha = \beta$, (11) becomes

$$ \lim_{\alpha \to 0} \mathrm{RE}^{\sim}(\alpha, a\alpha; \delta, \delta^*) = \frac{\delta^*}{4|\delta|} = \frac{\theta_1 - \theta_0}{4|\theta_0 + \theta_1 - 2\theta|}. \tag{12} $$

Thus, in the limit ($a\alpha = \beta \to 0$) the efficiency of the fixed sample procedure relative
to the WSPRT tends to 0 as $\delta \to \pm\infty$, it equals $\tfrac{1}{4}$ if $\theta = \theta_0$ or $\theta = \theta_1$, it is greater than
unity if $(5\theta_0 + 3\theta_1)/8 < \theta < (3\theta_0 + 5\theta_1)/8$, and it is infinite if $\theta = (\theta_0 + \theta_1)/2$. (The
limiting relative efficiency of $\tfrac{1}{4}$ obtained when $\theta = \theta_0$ or $\theta = \theta_1$ has previously been
noted by Chernoff [4, p. 19]; the result (11) for $\delta = \delta^*$ has been obtained by
Aivazian [1]. Both of these authors considered the general problem of testing
a simple hypothesis vs. a simple alternative, and studied the limiting relative
efficiency as the two hypotheses approach each other. For earlier work on this
subject see Paulson [8].)
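
The substitution that leads from (9) to (11) is stated without intermediate steps. A short sketch of it is given here, using the auxiliary symbol $u = -\log\alpha$ (introduced only for this illustration and not part of the original notation) and the asymptotic relation that follows from (10).

```latex
% Assumption for this sketch: u = -\log\alpha, so u \to \infty as \alpha \to 0.
\begin{align*}
\log(a\alpha^b) &= \log a - bu \sim -bu, \\
\lambda_\alpha &\sim \sqrt{2u}, \qquad
\lambda_{a\alpha^b} \sim \sqrt{2(bu - \log a)} \sim \sqrt{2bu}, \\
(\lambda_\alpha + \lambda_{a\alpha^b})^2 &\sim 2u\,(1 + \sqrt{b})^2 .
\end{align*}
% Case \delta < 0: the numerator in (9) is \log(a\alpha^b) \sim -bu.
\[
\frac{2\delta^*}{\delta}\cdot\frac{-bu}{2u\,(1+\sqrt{b})^2}
  = -\frac{\delta^*}{\delta}\left[\frac{\sqrt{b}}{1+\sqrt{b}}\right]^2
  = -\frac{\delta^*}{\delta}\left[\frac{1}{1+b^{-1/2}}\right]^2 .
\]
% Case \delta > 0: the numerator in (9) is -\log\alpha = u.
\[
\frac{2\delta^*}{\delta}\cdot\frac{u}{2u\,(1+\sqrt{b})^2}
  = \frac{\delta^*}{\delta}\left[\frac{1}{1+b^{1/2}}\right]^2 .
\]
% With b = 1 both cases reduce to \delta^*/(4|\delta|).
```

The constant $a$ drops out because $\log a$ is dominated by $bu$, which is why the limit depends on the rate $b$ alone.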


2. DISCUSSION 


In connection with (12) it is of some interest to recall that the WSPRT is
optimum [10] for the artificial problem of testing the simple hypothesis
$H_0: \theta = \theta_0$ vs. the simple alternative $H_1: \theta = \theta_1$, and we have noted that for this
problem it can (theoretically, at least, when $a\alpha = \beta \to 0$) achieve a four-to-one¹
saving in ASN when compared to the best competing fixed sample procedure.
However, in most practical problems $\theta$ is not restricted to taking on only two
values, but rather is assumed to lie in some (possibly infinite) interval. If the
WSPRT is used in place of the fixed sample procedure for such problems (for
example, for those which arise in connection with acceptance inspection; see
[9], chapter 7), then for a range of values of $\theta$ between $\theta_0$ and $\theta_1$ the advantage
may be reversed, i.e., the fixed sample procedure may yield a many-to-one
saving in ASN over the WSPRT; in fact, if we consider the limit of (8) as $\delta \to 0$
([9], p. 124) we obtain


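$$ \mathrm{RE}^{\sim}(\alpha, \beta; 0, \delta^*) = \frac{\{\log[(1-\alpha)/\beta]\}\{\log[(1-\beta)/\alpha]\}}{(\lambda_\alpha - \lambda_{1-\beta})^2}. \tag{13} $$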


When $\alpha = \beta$ the relative efficiency curve (8) is symmetric about $\delta = 0$ and
assumes its maximum at this point. For this special case (13) is greater than
unity if $\alpha = \beta < 0.008$ (approx.). This last computation may momentarily give
the reader some comfort, for he may reason that in practice one seldom uses
values of $\alpha$ which are less than 0.008, and if such small values of $\alpha$ are avoided
the WSPRT would appear to yield a uniformly smaller ASN than the fixed
sample procedure. However, the reader is reminded that (13) is an approximation
which neglects “excess” in both the ASN and the fixed sample size.
(“Excess” in the latter comes about because the smallest integer equal to or greater
than (7) is used.) Monte Carlo samplings [3] indicate that the ratio of the true
ASN associated with the ordinary WSPRT (which sets $A = (1-\alpha)/\alpha = 1/B$) to
the fixed sample size is greater than unity when $\delta^* = 0.2$ and $\delta = 0$ for values
of $\alpha$ which are less than 0.015 (approx.). (This critical value of $\alpha$ probably
increases with $\delta^*$ since then the ASN decreases and the “excess” increases.)
These results point up the great practical need for considering the possibility
of devising sequential procedures which guarantee $\alpha$ and $\beta$ at $\theta_0$ and $\theta_1$,
respectively, and which minimize (over all procedures which accomplish this) the
maximum value of the ASN. (Such a procedure is called a minimax procedure.)
A characterization of procedures which give this guarantee and which minimize
the ASN at a third point (other than $\theta_0$ or $\theta_1$) has been obtained by Kiefer and
Weiss [7], Section 4. For our problem, if $\alpha = \beta$ the ASN curve is symmetric
about $\theta = (\theta_0 + \theta_1)/2$; hence if one chooses this point as the point at which the
ASN is to be minimized, then their characterization applies to the minimax
procedure. The sequential procedure proposed by Anderson [2] is an important
contribution in the direction of minimaxing the ASN.

¹ It might be remarked that the approach to this limit is very slow when regarded as a function of $\alpha$ and $\beta$.
For example, when $\alpha = \beta$ and $\delta = \delta^*$ we obtain 0.416, 0.333, 0.306, and 0.292 from (8) for $\alpha = 10^{-2}$, $10^{-4}$, $10^{-6}$, and
$10^{-8}$, respectively.

Finally we remark that the much-quoted statement that “the WSPRT often 
results in an average saving of 50 percent in sample size” needs to be qualified. 
Clearly this “saving” depends on $\alpha$, $\beta$, and the true state of nature.
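
The dependence of the “saving” on the error probabilities and on the true state of nature can be checked numerically. The following is a minimal sketch, not the author's computation: it evaluates the approximate relative efficiency from the Wald approximations (4) and (5) and the fixed sample size (7), with the limiting form (13) used at $\delta = 0$. The function names `upper_point` and `relative_efficiency` are hypothetical, and the normal quantile is taken from Python's standard library.

```python
from math import log
from statistics import NormalDist


def upper_point(p: float) -> float:
    """lambda_P of equation (6): the point with upper normal tail area p.

    Uses the symmetry of the normal distribution: lambda_P = -Phi^{-1}(p).
    """
    return -NormalDist().inv_cdf(p)


def relative_efficiency(alpha: float, beta: float,
                        delta: float, delta_star: float) -> float:
    """Approximate ASN of the WSPRT divided by the fixed sample size, as in (8).

    'Excess' is neglected in both quantities, as in the text.
    """
    A = (1.0 - beta) / alpha
    B = beta / (1.0 - alpha)
    # (lambda_alpha - lambda_{1-beta})^2, i.e. (lambda_alpha + lambda_beta)^2
    denom = (upper_point(alpha) - upper_point(1.0 - beta)) ** 2
    if delta == 0.0:
        # (4) and (5) are of the form 0/0 at delta = 0; use the limit (13).
        return log((1.0 - alpha) / beta) * log((1.0 - beta) / alpha) / denom
    h = -delta / delta_star
    L = (A ** h - 1.0) / (A ** h - B ** h)      # OC approximation (4)
    numer = L * log(B) + (1.0 - L) * log(A)     # numerator of (5)
    return (2.0 * delta_star / delta) * numer / denom


if __name__ == "__main__":
    # alpha = beta and delta = delta*: slow approach to the limit 1/4
    for a in (1e-2, 1e-4, 1e-6, 1e-8):
        print(a, round(relative_efficiency(a, a, 0.2, 0.2), 3))
    # delta = 0: the fixed sample procedure wins once alpha = beta is small
    for a in (0.05, 0.005):
        print(a, round(relative_efficiency(a, a, 0.0, 0.2), 3))
```

With $\alpha = \beta$ and $\delta = \delta^*$ the sketch reproduces the four values 0.416, 0.333, 0.306, and 0.292 quoted in the footnote for $\alpha = 10^{-2}$, $10^{-4}$, $10^{-6}$, and $10^{-8}$. At $\delta = 0$ the ratio is below unity for $\alpha = \beta = 0.05$ and above unity for $\alpha = \beta = 0.005$, in line with the approximate threshold of 0.008.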


LIMITING EFFICIENCY OF WALD SEQUENTIAL RATIO TEST 
INTERNAL MIGRATION STATISTICS FOR THE UNITED STATES*


EVERETT S. LEE AND ANNE S. LEE
University of Pennsylvania 

A methodological survey of statistics on internal migration used in 
the new enlarged edition of Historical Statistics of the United States. No
one body of materials gives a complete picture but much can be pieced 
out from the essentially complementary series that exist: the state-of- 
birth data of the decennial censuses, 1850-; the special migration re- 
ports of the 1940 and 1950 censuses; the annual migration estimates of 
the Current Population Survey, 1947-; and the Department of Agri- 
culture’s annual estimates of migration to and from farms, 1920-. An 
additional important series can be derived from census age-sex-color- 
nativity distributions by survival methods. All such estimates are in- 
exact but important migration streams are clearly delineated and the 
margin of error is probably less than that of most economic series to 
which they are related. 


There are no exact or nearly exact statistics of internal migration in the
United States, for such statistics are found only in continuous population
registers. In Sweden, the Netherlands, and West Germany, for example, a migrant
from one community to another is required by law to take out a migration
certificate and to report to registration authorities both before and after migrating.
From these certificates statistics of in-migration and out-migration¹ can be
compiled for any desired period by detailed origin and destination and by a con- 
siderable number of social and economic characteristics as well. 

In the United States, however, reliance must be placed upon census and 
survey data, and the most that can be obtained is estimates which, to some 
degree, compound migration with other types of population change and which 
contain other types of errors as well. Furthermore, there is no one body of 
materials which gives a reasonably complete picture of migration within the 
United States. Much, however, can be pieced out from the essentially comple- 
mentary series which exist. These are: the state-of-birth data of the decennial 
censuses, first collected in 1850; the five-year migration data of the 1940 census 
and the one-year migration data from the 1950 census; the Current Population 
Survey which first contained migration data in 1945; and the Department of 
Agriculture estimates of migration to and from farms, beginning in 1920. In 
addition, the age-sex-nativity data of the decennial censuses can be used in 
connection with birth and death registrations or with survival ratios to obtain 
migration estimates. Each of these types of data is discussed below, but before 
examining specific materials a few general remarks may be useful. 

The limitations of national censuses and surveys are such that almost the only 
questions relating to migration that can be asked are in the general form 
“Where was this person born?” or “Where was this person living at a specified 
date in the past?” Migration status is then determined by comparing state or 


* This paper was prepared to supplement statistics on internal migration in Historical Statistics of the United 
States, Colonial Times to 1957 (Washington, D. C., 1960) compiled under the joint sponsorship of the Bureau of 
the Census and the Social Science Research Council. 

1 By convention, internal migrants are referred to as in-migrants and out-migrants to distinguish them from 
immigrants and emigrants, the external migrants. The balance between in-migrants and out-migrants is net internal 
migration. 
county of residence at the time of the census or survey with that at birth or
at the specified date. 

Such questions have the effect of dividing the population into two groups: 
those living at the same place at both times, the “nonmigrants”; and those 
living in different places, the “migrants.” And, with questions which refer to a
specific date, a third group, those born between that date and the census, are
left in an anomalous position, neither migrant nor nonmigrant. In reality the
“nonmigrant” group is not “pure” for, in addition to those who had not mi- 
grated, those who migrated but returned to the place of origin are included. 
The “migrant” group is correspondingly underenumerated. 

It is furthermore important to note that the data refer to migrants and not 
to migrations. In continuous population registers each migration is recorded 
and each can be classified by the characteristics of the migrant at the time of 
the migration. In census or survey data, however, only one migration—the most 
recent—can ordinarily be allotted to any one migrant and the characteristics 
recorded for the migrant are those at the time of the census or survey and not 
those at the time of the migration. The sum of the migrants at the end of a 
period can be far less than the number of migrations that occurred during the 
period. No matter how short the interval, some migrants die before its end, 
some return to the place of origin, and some migrate more than once. In general, 
the longer the period the less closely the number of enumerated migrants ap- 
proximates the number of migrations, and the less closely the characteristics 
of the migrants at the time of the census or survey resemble those at the time 
of the migration. 
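
The gap between migrations and enumerated migrants can be made concrete with a small sketch. The residence histories below are invented for illustration, and the `Person` class and the two counting functions are hypothetical helpers rather than any census procedure. A register would count every change of state, while a census counts only survivors whose state at enumeration differs from that at the reference date.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical record: state of residence at the reference date,
    # followed by the state after each subsequent move.
    residences: list
    alive_at_census: bool = True


def count_migrations(people):
    """Every change of state is one migration, as a continuous register records it."""
    return sum(
        sum(1 for a, b in zip(p.residences, p.residences[1:]) if a != b)
        for p in people
    )


def count_enumerated_migrants(people):
    """Persons a census would class as migrants: alive at the census and
    living in a state different from the one at the reference date."""
    return sum(
        1 for p in people
        if p.alive_at_census and p.residences[0] != p.residences[-1]
    )


# Invented histories covering the cases named in the text.
people = [
    Person(["Ohio", "Ohio"]),                          # never moved
    Person(["Ohio", "Texas"]),                         # one move, enumerated
    Person(["Ohio", "Texas", "Ohio"]),                 # returned to origin
    Person(["Iowa", "Utah", "Oregon"]),                # moved twice
    Person(["Iowa", "Maine"], alive_at_census=False),  # died before the census
]

print(count_migrations(people), count_enumerated_migrants(people))  # prints "6 2"
```

In this invented example six migrations occurred but only two migrants would be enumerated: the return migrant is classed as a nonmigrant, the person who moved twice counts once, and the person who died is not counted at all.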

Further difficulties arise from the use of states or counties as the geographic 
unit by which migration is defined. These, of course, are neither homogeneous 
nor constant units, and the result is that migration, so defined, is not the same 
thing from one time to another, nor within the same period, from one part of 
the country to another. Though nearly stable in the last few decades, both 
state and county boundaries shifted frequently in the past with marked changes 
in both number and size. At present the variation in size is great. Texas is 
220 times as large as Rhode Island, and San Bernardino County, California 
with 20,131 square miles is larger than each of the States of Connecticut, 
Delaware, Maryland, Massachusetts, New Jersey, Rhode Island, and Vermont. 
A single city may be divided between two or more counties—New York City 
contains five—and metropolitan areas sprawl across both state and county lines. 

It is, therefore, obvious that internal migration data for the United States 
are defective and must be used with caution. However, they are indispensable 
to the study of social and economic development and, in fact, are much better 
than most statistics used for that purpose. The data are such that generaliza- 
tions about trends and gross movements can be made with confidence, and it 
is only in regard to small and relatively unimportant shifts that doubt exists. 


I. STATE-OF-BIRTH DATA 
Available Materials 


The collection of data on the internal migration of its population was begun
by the United States at the same time that a nativity classification was
introduced into the census. Following the examples set by the City of Boston and
the State of New York in their censuses of 1845, the federal authorities included
a question on state of birth of the native population in the schedule for 1850. 
Confident of the value of the resulting statistics, the Superintendent of the 
Census, Joseph Kennedy, made an interesting but unfulfilled prediction: 

“Another interesting branch of this inquiry is that which concerns the inter- 
migration of our native citizens among the States. The tables presenting a view 
of this movement will be most useful and valuable in tracing the progress of 
different portions of the country. The facts developed will show how far one 
section has impressed its own characteristics on others....The roving 
tendency of our people is incident to the peculiar condition of their country, 
and each succeeding Census will prove that it is diminishing. When the fertile 
plains of the West shall have been filled up, and men of scanty means cannot
by a mere change of location acquire a homestead, the inhabitants of each State 
will become comparatively stationary, and our countrymen will exhibit that 
attachment to the homes of their childhood, the want of which is sometimes 
cited as an unfavorable trait in our national character.”²

State-of-birth data are in the form of distributions of state of birth by place 
of residence at the time of the census. Tabulations are found for state of 
residence at each census from 1850 through 1950; for cities from 1850 through 
1940 (with the exception of Washington which is shown for 1950 because it is 
coterminous with the District of Columbia); for counties in 1850, 1870, and 
1880; and for the rural and urban parts of states in 1910, 1920, 1930, and 1940. 

Data for states. The detail by which state-of-birth data have been published 
for states of residence has varied considerably in the 100 years over which they 
have appeared. For 1850 and 1860 the data refer to the free population, whites 
and free colored combined, but from 1870 on separate figures for whites and 
nonwhites can be obtained. Parentage of the native white population is given 
for the censuses from 1890 to 1930 and sex is shown for 1930 through 1950. 
Negroes can be distinguished from other nonwhites in the censuses of 1870 and 
1900-1930. The cross-classifications by color or race, parentage, and sex are 
shown for each census in Table 1. 

Perhaps the greatest deficiency in the state-of-birth data before 1950 was 
the lack of age classification. In 1940 special tabulations were made of children 
under 5 years old by color as a supplement to the five-year migration data of 
that census. A complete age classification was made for the first time in 1950. 
State of residence by division of birth and state of birth by division of residence 
were given at that census by color and sex for the following age groups: under 
5, 5-9, 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, and 70 and over. The full 
tabulation of state of birth by state of residence can be obtained as unpublished 
materials from the Bureau of the Census. 

In addition to these basic demographic characteristics, other cross-classifi- 
cations have appeared occasionally. In the census publications for 1850, deaths 
for the year ending June 1, 1850 were shown for 107 specific causes for states 
and parts of states (e.g. Middle Georgia) in which the deaths occurred by 
groups of states of birth (e.g. Northwest). For seven states (Indiana, Maryland, 
Massachusetts, Mississippi, Missouri, North Carolina, and Virginia) native 


² Report of the Superintendent of the Census for December 1, 1852, p. 15.
TABLE 1. STATE OF BIRTH DATA BY COLOR OR RACE,
PARENTAGE, AND SEX, 1850-1950 


1850 | 1860 | 1870 | 1880 1900 | 1910 


Total native x 
Native white x xX 
Native parentage 
Foreign or mixed parentage 
Mixed parentage 
Foreign parentage 
Native nonwhite 
Native Negro 
Native Chinese 
Native Indian 
Native white and free 
colored combined 
Native males 
Native white x 
Native nonwhite x 
Native females 
Native white x 
Native nonwhite x 

1 1850: For Connecticut and Louisiana, free colored, subdivided as black and mulatto, are shown by state of birth.

2 1870: Negroes designated as colored in census tables.

3 1870: Chinese included 1 Japanese.

4 1880 and 1890: Nonwhite designated as colored in census tables.

5 1890 and 1900: Native parentage included persons with both parents native born, one parent native born
and one with unknown birthplace, and those with both parents of unknown birthplace.

6 1890 and 1900: Foreign or mixed parentage designated as foreign parentage in census tables.

7 1930: 805,535 native Mexicans who would have been included with native white at other censuses were
classified as native nonwhite.


paupers in poor houses, convicts in penitentiaries, and inmates of jails and 
houses of correction were classified as in-state and out-of-state born. For 1850, 
1860, and 1870 state-of-birth data were published for the deaf and dumb, the 
blind, the insane, and the idiotic. 

Data for counties. Only limited data are available for counties. In 1850 the 
native free population of each county was divided into in-state and out-of- 
state born. In 1870 the classifications for counties were: total native population; 
the in-state born; and persons born in each of five selected states, the states 
varying from area to area. In 1880 the classifications were the same and there 
were nine selected states of birth instead of five. After 1880 there were no 
county data. 

Data for cities. At each census except 1950 state-of-birth data were published 
for a number of the largest cities, a list of which is given in Table 2 by the 
census in which they appear. The number included at each census varied as 
follows: 

1850 29 cities 

1860 9 cities 

1870 cities 

1880 cities 

1890 124 cities (25,000 or more population)
1900 160 cities (25,000 or more population)
1910 109 cities (50,000 or more population) 
1920 144 cities (50,000 or more population) 
1930 191 cities (50,000 or more population) 
1940 92 cities (100,000 or more population) 
1950 1 city (Washington, D. C.) 


1920 | 1930 | 1940 | 1950 
x x x x x x x 
: x x x x? 
xs xs | x x x? 
xX x x? 
x 
5 x x x 
x 
x x 
x x 
x 
x x 
x x 


AMERICAN STATISTICAL ASSOCIATION JOURNAL, DECEMBER 1960 


TABLE 2. STATE OF BIRTH DATA FOR CITIES, 1850-1940 


City | 1850 | 1860 | 1870 | 1880 | 1890 | 1900 | 1910 | 1920 | 1930


Akron x 
Albany x 
Allegheny 
Allentown 
Altoona 
Asheville 
Atlanta 
Atlantic City 
Auburn 
Augusta 
Austin 
Baltimore 
Bay City 
Bayonne 
Beaumont 
Berkeley 
Bethlehem 
Binghamton 
Birmingham 
Boston Mass. 
Bridgeport Conn. 
Brockton 
Brooklyn 
Buffalo 

Butte 
Cambridge 
Camden 
Canton 

Cedar Rapids 
Charleston 
Charleston 
Charlestown 
Charlotte 
Chattanooga 
Chelsea 
Chester 
Chicago 
Cicero 
Cincinnati 
Cleveland 
Cleveland Hts. 
Columbia 
Columbus 
Council Bluffs 
Covington 
Dallas 
Davenport 
Dayton 
Dearborn 
Decatur 
Denver 


Xx 
x 
x 
x 


xXx 


XxX 


x 
1940

x 
x 
x 
x 
x 
x 

x 
x 
x 
x 
x 

x 

x 
x 
x 
x 
TABLE 2 (cont.)


City 


1860 


1870 


1880 


1890 


Des Moines 
Detroit 
Dubuque 
Duluth 
Durham 
E. Chicago 
Easton 
E. Orange 
E. St. Louis 
Elizabeth 
Elmira 
El Paso 
Erie 
Evanston 
Evansville 
Fall River 
Fitchburg 
Flint 
Ft. Wayne 
Ft. Worth 
Fresno 
Galveston 
Gary 
Glendale 
Gloucester 
Grand Rapids 
Greensboro 
Hamilton 
Hammond 
Hamtramck 
Harrisburg 
Hartford 
Haverhill 
Highland Park 
Hoboken 
Holyoke 
Houston 
Huntington 
Indianapolis 
Irvington 
Jackson 
Jacksonville 
Jersey City 
Johnstown 
Joliet 
Joplin 
Kalamazoo 
Kansas City 
Kansas City
Kenosha 
Knoxville 
La Crosse 


x 


x 


xX X 


XXX 


xX XXX 


State | 1850 | 1900 | 1910 | 1920 | 1930 | 1940
Iowa xf x x x 
Mich. x x x 4 x 
Iowa x 
Minn. x x x x 
N.C. 

Ill.
Pa. 
Ill. x 
N.J. x x x
x 
Texas 
Pa. x x x 
Ill.
Ind. x x 
Mass. x x x x x 
Mass. | 
Mich. x 
Ind. x x x 
Texas x 
Cal. 
Texas x 
Ind. 
Cal. 
Mass. | 
Mich. x x x % 
N.C. 
Ohio 
Ind. 
Mich. 
Pa. x 
Conn. x x x x x 
Mass. 
Mich. 
N.J. * 
Mass. 
Texas x 
W.Va. 
Ind. x x x x 
Mich. 
Fla. x x x 
x x x x x 
Pa. x x 
Ill. 
Mo. 
Mich. 
Kan. x x x x 
Mo. x x x x x 
Wisc. 
Tenn. x x 
Wisc. x 
City | 1860 | 1870 | 1880 | 1890


Lakewood 
Lancaster 
Lansing 
Lawrence 
Lexington 
Lincoln 
Little Rock 
Long Beach 
Long Is. City 
Los Angeles 
Louisville 
Lowell 

Lynn 

Macon 
Madison 
Malden 
Manchester 
McKeesport 
Medford 
Memphis 
Miami 
Milwaukee 
Minneapolis 
Mobile 
Montgomery 
Mt. Vernon 
Nashville 
Newark 
New Bedford 
New Britain 
Newcastle 
New Haven 
New Orleans 
Newport 
New Rochelle 
Newton 
New York 
Niagara Falls 
Norfolk 
Oakland 
Oak Park 
Oklahoma City 
Omaha 
Oshkosh 
Pasadena 
Passaic 
Paterson 
Pawtucket 
Peoria 
Phila. 
Pittsburgh 
Pontiac 


XX XK X 
xX X X X XxX 
XXX XXX 


x 


x 
x 
x 
x 


xx MN K KRM KR 
XxX -KXXX 


XxX XX X 
xX 


XX 


xXXXXXX 


XX XK XK 


1910 | 1920 | 1930 | 1940
Ohio 
Pa. 
Mich. 
Mass. x x 
Ky. 
Neb. 
| Ark. 
Cal. x 
N.Y. 
Cal. x 
Ky. x x x x 
| Mass. x x 
Mass. x x 
Ga. 
Wisc. 
: Mass. 
N.H. x x 
Pa. 
Mass. 
Tenn. x x x x 
| Fla. x 
Wisc. x x x x 
Minn. x x 
Ala. x x x 
Ala. 
N.Y. 
Tenn. x x x x 
| N.J. x x x x x 
: Mass. x x 
Conn. 
Pa. 
Conn. 4 x x x x 
La. x x x x x 
Ky. 
N.Y. 
Mass. 
N.Y. x x x x x x 
N.Y. 
Va. x x 
Cal. x x 
Ill.
Okla. x 
Neb. x 4 
Wisc.
Cal. 
N.J. | 
N.J. x x x x 
: x 
Ill. x x
Pa. x x x x x x 
Pa. x x x x 
Mich. 
City


1860 


1870 


1880 


1890 


Port Arthur 
Portland 
Portland 
Portsmouth 
Portsmouth 
Providence 
Pueblo 
Quincy 
Quincy 
Racine 
Reading 
Richmond 
Roanoke 
Rochester 
Rockford 
Sacramento 
Saginaw 

St. Augustine 
St. Joseph 
St. Louis 
St. Paul 
Salem 

Salt Lake City 
San Antonio 
San Diego 
San Francisco 
San Jose 
Savannah 
Schenectady 
Scranton 
Seattle
Shreveport 
Sioux City 
Somerville 
South Bend 
So. Omaha 
Spokane 
Springfield 
Springfield 
Springfield 
Springfield 
Superior 
Syracuse 
Tacoma 
Tampa 
Taunton 
Terre Haute 
Toledo 
Topeka 
Trenton 
Troy 

Tulsa 


x 


x 
x 


x 


XX XK X 


xX X 


x X 


x 


KX RK 


XXX XX 


XXXX 


State | 1850 | 1900 | 1910 | 1920 | 1930 | 1940
Texas x 
Me. x i | | x x x 
Ore. x x x x 
N.H. x 
Va. x 
R.I. x x x x x x x 
Colo. 
Ill. |_| 
Mass. 
Wisc. 
Pa. 4 | x 
Va. x x 
Va. 
N.Y. x x | x 
Ill. 
Cal. x 
Mich. 
Fla. x 
Mo. 
Mo. x x x x 
Minn. x 
Mass. 
Utah 
Texas 
Cal. 
Cal. x x 
Cal. 
Ga. x x
Pa. x x x 
Wash. x 
La. 
Iowa 
Mass. x 
Ind. x 
Neb. 
Wash. x 
Ill.
Mass. x 
Mo. 
Ohio 
Wisc.
x x x x 
Wash. x x 
Fla. x 
Mass. 
Ind. x 
Ohio x x x x 
Kan. 
N.J. x x 
N.Y. x x x 
Okla. x 
City | 1860 | 1870 | 1880 | 1890


Union City 
Utica x 
Waco 
Washington D.C. x
Waterbury 
Wheeling 
Wichita 
Wilkes Barre 
Williamsport 
Wilmington 
Wilmington 
Winston Salem 
Woonsocket 
Worcester 
Yonkers 

York 
Youngstown 


X 
XXX X 
x 


For 1850 data were presented for each city for the white and free colored 
populations combined, and for New Orleans and New York City the free colored 
were shown separately and divided into blacks and mulattoes. In 1860 the 
free population was given by sex. The racially most detailed tables were for
1870, when whites, Negroes (called “colored”), Chinese and Indians were shown
separately. For 1880, 1890, 1910, and 1920 data were published for total native 
population only, but in 1900 total native, native white of native parentage, and 
native white of foreign or mixed parentage were presented separately. For 1930 
total native population was given for all cities of 50,000 or more population 
and, in addition, Negroes were shown for the 86 of these cities with at least 
5,000 of this race. In 1940, total native population was given for all cities of 
100,000 or more population and nonwhites were shown for the 65 of these cities 
with 5,000 or more of this color group. For Washington, D. C. the same infor- 
mation was always available as for one of the states. 

Rural-urban residence. In 1910 and 1920 the total native and native white 
populations living in the urban and rural areas of each state were divided 
between those born in the state of residence and those born in other states. In 
1930 and 1940 these data were considerably expanded but the color break was 
omitted. At these two censuses individual states of birth were given for the 
urban, rural nonfarm, and rural farm residents of each state. 


Estimation of Migration from State-of-Birth Data 


For the purpose of estimating migration during a given period, state-of-birth 
data share with the other migration materials of the 1940 and 1950 censuses 
and the Current Population Survey the general defect that migration must be 
assessed from counts of living migrants. The state-of-birth question accomplishes 
the division of the native population into two major groups, those living in
the state of birth and those living outside the state of birth. As such, this is
highly useful information and can be used to estimate the migration flows
into and out of states, but for counties, cities, and rural or urban areas the
data do little to indicate the volume of migration.

For any of these areas smaller than states a change in the number of persons 
born in other states or in any one of the other states may occur when no migra- 
tion has taken place; a negative change may result from mortality alone; and 
a positive change indicates only the in-migration of at least that number of 
people. Over a given period the change in the number of out-of-state born is
a net figure which includes as the positive factor the in-migration during the
interval of out-of-state born; and as negative factors the out-migration during
the interval of the out-of-state born, and the deaths during the interval of the
out-of-state born, regardless of the time of previous in-migration. Thus a change in
the number of out-of-state born represents a mixture of interstate and intra- 
state migration and mortality and is affected by the migration of past periods. 
Furthermore, the migration of persons born within the state in which the city, 
county, or area is located does not enter into the calculation, though for most 
such areas this is probably the greater part of the total migration. 

State-of-birth data for counties and cities and rural and urban areas are useful 
for a number of other purposes, however. In indicating the composition of the 
native population by state of birth, they also indicate segments of the popula- 
tion which have migrated long distances and which were born and at least 
partly reared under different social and economic conditions. They also indicate 
whether a sudden influx has occurred. It is quite useful to know, for example, 
how many Negroes born in the South were living in San Francisco or New York 
City at a particular census and whether their number had greatly increased 
over the preceding census. To some degree these data also give a measure of 
the attractiveness of a particular city or county for the populations of nearby 
or distant states. The materials for rural and urban areas are particularly useful 
in indicating whether persons who move long distances are predominantly 
drawn to urban areas. 

All the advantages of state-of-birth data for units smaller than states also 
apply to states and, in addition, the state-of-birth data can be manipulated to 
give an adequate if not precise picture of the great migration flows within this 
country over the past century. The techniques are simple, amounting to little 
more than arithmetic. 

The following are the basic items in the census tabulations: 

(1) The number of persons born in each state and living there at the time 
of the census, a group loosely referred to as “nonmigrants.” 

(2) The number of persons born in each state and living in any other state. 

(3) The number of persons living in each state and born in any other state. 

From these the following quantities are easily derived: 

(1) The net gain or loss in the interchange between two states, obtained by
subtracting the number born in State X and living in State Y from the number
born in State Y and living in State X.

(2) The “in-migrants” into a state: the sum of the numbers born in other
states and living in the state in question.

(3) The “out-migrants” from a state: the sum of the numbers born in the
state and living in other states.

(4) The “birth-residence”³ index, obtained by subtracting (3) from (2) above.
It is, therefore, the net balance between persons living in a state and born in
other states and those born in that state and living in other states.
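
The derived quantities above amount to row and column sums of a table of state of birth by state of residence. A minimal Python sketch follows. The three states and every count are hypothetical, chosen only to illustrate the arithmetic, and the function names are not taken from any census report.

```python
# Hypothetical counts: table[birth][residence] = natives born in `birth`
# and living in `residence` at the census. All figures are invented.
table = {
    "X": {"X": 900, "Y": 60, "Z": 40},
    "Y": {"X": 100, "Y": 700, "Z": 50},
    "Z": {"X": 30, "Y": 20, "Z": 500},
}

def net_interchange(table, x, y):
    """Net gain (+) or loss (-) to state x in its interchange with state y."""
    return table[y][x] - table[x][y]

def in_migrants(table, state):
    """Persons living in `state` but born in any other state."""
    return sum(row[state] for birth, row in table.items() if birth != state)

def out_migrants(table, state):
    """Persons born in `state` but living in any other state."""
    return sum(n for res, n in table[state].items() if res != state)

def birth_residence_index(table, state):
    """In-migrants minus out-migrants, i.e. item (2) minus item (3)."""
    return in_migrants(table, state) - out_migrants(table, state)

for s in table:
    print(s, in_migrants(table, s), out_migrants(table, s),
          birth_residence_index(table, s))
```

Because every out-migrant from one state is an in-migrant to another, the indices in such a closed table sum to zero over all states.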

The migration shown by the above indices, however, is without time reference 
since the only restriction on the time over which the indicated migration oc- 
curred is the span of life itself. The only possible solution to the problem posed 
by this ambiguity of the time period is to compute intercensal changes. For the
three most important of these (those which yield the intercensal measures of
in-migration, out-migration and net migration) the ways in which they fail
to be exact measures of intercensal migration are shown below. Migration of
natives to and from places outside the continental United States is ignored.⁴

1. In-migration—Intercensal change in the number of persons living in a 
state and born in other states. This is equivalent to: 

(a) Number of persons born in other states and migrating to State X during 
the intercensal period. 

(b) Minus deaths to persons born in other states (1) who were living in 
State X at the first census, or (2) who migrated to State X during the inter- 
censal period. 

(c) Minus out-migrants from State X who were born in other states and 
(1) who were living in State X at the first census, or (2) who had migrated to 
State X during the intercensal period. 

The amount of migration is clearly understated. Not all the in-migrants are 
included in term (a) since those born within the state are not included. The 
apparent number of in-migrants is also reduced by the death or subsequent out- 
migration of persons born in other states, regardless of the time of in-migration. 
These reductions are occasionally so large that “negative in-migration” is 
indicated. 

2. Out-migration—Intercensal change in the number of persons born in a 
state and living in other states. This is equivalent to: 

(a) Number of persons born in State X and migrating to other states during 
the intercensal period. 

(b) Minus deaths to persons born in State X (1) who were living in other 
states at the first census, or (2) who migrated to other states during the inter- 
censal period. 

(c) Minus in-migrants to State X who were born in State X but (1) were 
living in other states at the first census, or (2) who migrated from State X to 
other states during the intercensal period. 

The number of out-migrants is also understated. Not included in term (a) are
the out-migrants who had been born in other states and who had previously
in-migrated to State X. The apparent number of out-migrants is further reduced
by the deaths in other states or the in-migration to State X of persons born in
State X, regardless of the time of out-migration from State X. Thus, like the
measure of in-migration, this index is affected by the migration of past decades,
and a negative number of “out-migrants” can result from the calculation.

3 This term was used by C. Warren Thornthwaite in his Internal Migration in the United States, Study of
Population Redistribution (Philadelphia: University of Pennsylvania Press, 1934), and has since become standard
terminology.

4 For a more complete discussion, see Everett S. Lee and others, Population Redistribution and Economic
Growth: United States, 1870-1950, Volume I, Methodological Considerations and Reference Tables (Philadelphia:
The American Philosophical Society, 1957), pp. 57-64.

3. Net migrants—Intercensal change in the birth-residence index, or the 
intercensal change in the number of persons born in other states and living in 
State X minus the number born in State X and living in other states, or 1. 
minus 2. above. With the necessary sign changes the terms implicit in this 
index are: 

(a) Number of persons born in other states and migrating to State X during
the intercensal period. 

(b) Minus deaths to persons born in other states (1) who were living in State 
X at the first census, or (2) who migrated to State X during the intercensal 
period. 

(c) Minus out-migrants from State X who were born in other states and 
(1) were living in State X at the first census, or (2) who had migrated to State 
X during the intercensal period. 

(d) Minus number of persons born in State X and migrating from State X to 
other states during the intercensal period. 

(e) Plus deaths to persons born in State X (1) who were living in other states 
at the first census, or (2) who had migrated to other states during the intercensal 
period. 

(f) Plus in-migrants to State X who had been born in State X but (1) who 
were living in other states at the first census or (2) who had migrated from 
State X during the intercensal period. 

Though in-migration and out-migration are invariably understated by the 
state-of-birth method, net migration may be either under- or overestimated. 
All native in-migrants to State X are included in terms (a) and (f), and all 
native out-migrants are included in terms (c) and (d). If these were the only 
terms, a true number of net interstate migrants and the true net interstate 
migration would be obtained, but terms (b) and (e) are error terms. Thus the 
number of net migrants is overstated by the number of deaths in other states 
of persons born in State X, and understated by the number of deaths in State 
X of persons born in other states, regardless of the time of migration from the 
state of birth. Thus the index is affected by the migration of past decades and 
there may appear to be a net gain when a net loss actually occurred. 
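
The bookkeeping behind the three intercensal measures can be checked numerically. The sketch below, in Python, assigns invented values to terms (a) through (f); none of these figures comes from a census, and the class and function names are hypothetical. It shows an apparent net gain where the true net migration is a loss, with the error equal to term (e) minus term (b).

```python
from dataclasses import dataclass

@dataclass
class Components:
    """Hypothetical intercensal components for State X, terms (a)-(f)."""
    a_in_born_elsewhere: int   # (a) in-migrants born in other states
    b_deaths_in_x: int         # (b) deaths in X of persons born elsewhere
    c_out_born_elsewhere: int  # (c) out-migrants from X born elsewhere
    d_out_born_in_x: int       # (d) out-migrants born in X
    e_deaths_elsewhere: int    # (e) deaths elsewhere of persons born in X
    f_return_born_in_x: int    # (f) X-born persons migrating back to X

def measured_in(c):
    # Intercensal change in out-of-state born living in X: (a) - (b) - (c).
    return c.a_in_born_elsewhere - c.b_deaths_in_x - c.c_out_born_elsewhere

def measured_out(c):
    # Intercensal change in X-born living elsewhere: (d) - (e) - (f).
    return c.d_out_born_in_x - c.e_deaths_elsewhere - c.f_return_born_in_x

def measured_net(c):
    # Intercensal change in the birth-residence index: 1. minus 2.
    return measured_in(c) - measured_out(c)

def true_net(c):
    # All native in-migrants, (a) + (f), minus all native out-migrants, (c) + (d).
    return (c.a_in_born_elsewhere + c.f_return_born_in_x
            - c.c_out_born_elsewhere - c.d_out_born_in_x)

c = Components(1000, 300, 200, 950, 500, 100)
print(measured_in(c), measured_out(c), measured_net(c), true_net(c))
# The error in the net measure equals (e) - (b).
print(measured_net(c) - true_net(c) == c.e_deaths_elsewhere - c.b_deaths_in_x)
```

With these invented inputs the measured net is positive while the true net is negative, which is the situation described in the closing sentence of the paragraph above.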

One of the most important reasons for the use of state-of-birth data in his- 
torical studies is that they are the only statistics before 1940 which can be used 
to determine the distance and direction of migration flows. Both must be 
estimated by comparing state of birth with state of residence though much 
migration is between states neither of which is the state of birth. Thus a person 
who moves from Florida, his state of birth, to Texas to Wisconsin to Oregon to 
California to Georgia appears finally as a migrant from Florida to Georgia. The 
apparent south-north direction of the migration is in reality the resultant of 
east to west, south to north, east to west, north to south, and west to east 
directions of the separate moves, and the apparent short distance migration 
between two contiguous states is the resultant of a number of long moves. In 
general, it can be said that the direction of major migration flows is clearly
delineated in state-of-birth data, but that much of what appears to be long- 
distance migration is in reality the result of succeeding shorter moves away 
from the state of birth. 


Completeness and Reliability of State-of-Birth Data 


In addition to the general problem of the reliability of the data there are a 
number of special problems, some of which apply only to one census. In the 
early censuses the entire population was not enumerated, the complete area of 
the continental United States was not included in the censuses, and the same 
names do not always apply to the same geographic area. For 1850 and 1940 
there are known errors of some consequence in the published data, and the 
use of a 20 per cent sample for tabulation purposes introduces the question of 
sampling variation in 1950. In 1930 the Mexicans were classified as nonwhite 
instead of white as in all other censuses. 

Accuracy of responses. Just how accurate responses to state-of-birth questions 
are has been tested at only one census. The Post-Enumeration Survey of the 
1950 Census showed the following: 

“The results on the extent of reporting error indicate that an estimated 0.6 
per cent of the persons properly counted in the census were reported differently 
with respect to nativity in the Post-Enumeration Survey and in the census. The 
differences in reporting tended to cancel, however, and have only a negligible 
effect on the native and foreign-born population according to the statistics 
based on the complete count. ... With respect to the reporting of State of 
birth, it is estimated that another State of birth was obtained in the Post- 
Enumeration Survey than in the census for approximately 4,000,000, or about 
3 per cent of the persons properly included in the census counts for whom State 
of birth was reported.”⁵

Unknowns. A related problem is that of unknowns. Until 1950 the census 
schedules were edited to assign a state of birth to persons for whom none was 
shown, whenever warranted by other information in the census schedules about 
the individual or members of his household. Before 1950 the number of un- 
knowns remaining after editing the schedules was relatively small, ranging from 
none in 1880 to less than 400,000 at other censuses. In 1950 the schedules were 
not edited in this fashion and the number of persons of unknown state of birth 
rose to 1,400,000, but even then, as at all other censuses, unknowns constituted 
less than one per cent of the native population.

Incomplete enumeration. At the earlier censuses state-of-birth information 
was not collected for the total population. As indicated in Table 1, slaves were 
not included in these tabulations in 1850 and 1860, and no attempt was made 
to achieve full coverage of the Indian population until 1900. Prior to this census 
only “civilized Indians,” that is those living among whites and presumably 
paying taxes, were enumerated. As indicated below in the discussion of the 
changing areas of states and territories, not all of the continental United States 
was included in the enumeration area before 1890. 


5 U. S. Census of Population: 1950, Volume IV, Special Reports, Part 4, Chapter A, State of Birth, p. 4.

Changing boundaries of states and territories. A special problem in the
interpretation of state-of-birth materials is the changes in the boundaries of states and
territories before the Census of 1910. Not only did the number of states increase, 
but the number of territories rose and fell. Some states and territories of the 
same name had quite different boundaries from census to census, and it is 
evident that respondents were often uncertain as to where they were born in 
terms of the states and territories as they were constituted at the time of 
the census. 

In 1850 the United States included 31 states, 4 territories, and the District 
of Columbia. Except that the State of Virginia contained also the area of 
present-day West Virginia, these states had essentially the same boundaries as 
today. Of the four territories, Minnesota Territory contained not only the 
present state of the same name but also parts of North and South Dakota. 
Utah Territory was composed of all of present-day Utah and parts of Colorado, 
Nevada, and Wyoming. Oregon Territory included all of Idaho, Oregon, and 
Washington as well as a small portion of Montana, and New Mexico Territory 
contained parts of Arizona, Colorado, Nevada, and New Mexico. The re- 
mainder of the country consisted of Indian or unorganized lands or belonged to 
Mexico (the Gadsden Purchase area) and was not enumerated. As a place of 
birth the territories were shown as a pooled category except for persons born 
and living in the same territory. 

In 1860 data were reported for 34 states, 7 territories, and the District of 
Columbia. Except for Virginia, the states had essentially their present bound- 
aries. Of the territories, Dakota Territory included not only North and South 
Dakota but most of Montana and part of Wyoming. Nebraska Territory was 
composed of Nebraska and part of Wyoming. Washington Territory included 
all of Washington and Idaho and part of Montana. New Mexico Territory 
contained all of the states of New Mexico and Arizona and a small part of 
Nevada, and Utah Territory was composed of Utah and parts of Nevada and 
Wyoming. Nevada Territory contained only the western portion of what was 
to become that state. Although Colorado Territory was not formed until 1861 
(with present boundaries) data were published for this area in the Census of 
1860. For the territory-born no data were presented as to specific territory of 
birth. No enumeration was made in Indian Territory or in the unorganized 
lands, which together constituted the present State of Oklahoma. 

By 1870 nearly all the states had attained their present boundaries. North 
and South Dakota, however, were still combined as Dakota Territory. Indian 
Territory and the unorganized lands, which together comprised present-day 
Oklahoma, were not enumerated. Although the State of West Virginia had been 
formed from the 48 western counties of Virginia in 1863 (two more were an- 
nexed in 1866), all persons born in either West Virginia or Virginia were classi- 
fied as having been born in Virginia. It has been customary to assume, how- 
ever, that the 381,297 Virginia-born persons residing in West Virginia were 
living in the state of birth. 

In 1880, as in 1870, North and South Dakota were still combined as Dakota 
Territory and no enumeration was made in what is now Oklahoma. 

In 1890, Indian Territory and the Indian reservations, which had not
previously been enumerated, were specially enumerated. Those areas contained
117,368 native whites and 208,083 native nonwhites for whom no state-of-birth 
distribution was made. The enumeration in Oklahoma Territory listed 1,045 
persons as born in Indian Territory. Since Indian Territory and Oklahoma 
Territory were later combined to form the State of Oklahoma, persons born 
in Indian Territory and living in Oklahoma Territory are usually classified as 
living in the state of birth. 

In 1889 North and South Dakota were formed from Dakota Territory, and 
the merging of Indian Territory and Oklahoma Territory into the State of 
Oklahoma in 1907 marked the end of boundary changes of any consequence. 

Errors in the 1850 Census. Despite what were considered explicit instructions 
to the enumerators, many errors were made in filling out the schedules of that 
census. In his “Remarks upon the schedules of 1850, etc.”, J. D. B. De Bow
noted the following: 

“Blanks in the nativity column sometimes extend to whole pages. These 
blanks were considered in the office to mean that the person was born in the 
State, as the only probable construction. Frequently, after naming a dozen or 
more persons born in the State, a person is mentioned born in another State; 
then a dozen follow with the usual check, (“) though it is evident that the last 
belonged to the State of the first mentioned.”⁶

The returns of the Seventh Census were published in 1853, an abstract having 
appeared earlier the same year, and many inconsistencies and arithmetic in- 
accuracies existed among the different tables contained in that report. Table 
XV (pp. xxxvi-xxxvii) presented data on the place of birth of the free residents 
of each state or territory by state if native born and by country if foreign born, 
together with the number of persons of unknown birthplace. This table was 
prefaced with these remarks: 

“The following table, giving specifically the places of birth of the inhabitants 
of all the States, was published in the Abstract Report of the Census ordered 
to be printed by the last Congress. It seems proper to include it among the ag- 
gregate tables of the present volume, although time has not admitted of its 
examination, and although in many particulars it does not agree with the other 
published results; and in some cases, as in regard to the number of natives of 
California residing in Connecticut, and in regard to the number of native-born 
residents of the Territories, objections have been raised, etc.”⁷

Tables XVI and XVII (based upon Table III for each state) of the same
report (p. xxxviii) gave for whites and for free colored, separately by sex, the
number of persons born in the state where residing, born out of the state but
in the United States, born in foreign countries, and of unknown birthplace. A
comparison of these tables with Table XV shows that the latter included 8 more
persons in the total population, 29,279 more in the native population, 167,853
more living in the state of birth, 33,763 fewer foreign born, and 4,492 more of
unknown birthplace. Since these differences affected all the states, sometimes
by large amounts, a state-by-state comparison is shown in Table 3.

6 The Seventh Census of the United States: 1850, p. iv.

7 Ibid., p. xxxvi. In using this table, Joseph A. Hill wrote, “These figures are of necessity derived from the
birthplace table in the census report for 1850, which is not in agreement with the other tables in that report. It is
believed, however, that the margin of error is not great enough to affect the validity of the general conclusions and
[...]ns based upon these figures in this study of interstate migration.” (“Interstate Migration” in Special Reports,
Supplementary Analysis and Derivative Tables, Twelfth Census of the United States: 1900, p. 281)
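
The differences quoted for the United States can be checked against the totals printed in the first row of Table 3. The short Python sketch below does only that arithmetic; the variable names are illustrative, and source A is Table XV while source B is Tables XVI and XVII.

```python
# United States totals for 1850 as printed in Table 3.
# Each pair is (source A: Table XV, source B: Tables XVI and XVII).
us_totals = {
    "native": (17_737_578, 17_708_299),
    "foreign born": (2_210_839, 2_244_602),
    "born and living in same state": (13_624_902, 13_457_049),
}

# Excess of source A over source B for each category.
differences = {name: a - b for name, (a, b) in us_totals.items()}
for name, diff in differences.items():
    print(f"{name}: {diff:+,}")

# The text also gives 4,492 more of unknown birthplace in Table XV.
# Native + foreign born + unknown should account for the 8-person
# difference in total population.
unknown_excess = 4_492
total_excess = (differences["native"] + differences["foreign born"]
                + unknown_excess)
print("total population:", total_excess)
```

The three category differences reproduce the figures in the text, and together with the unknown-birthplace difference they account for the 8 additional persons in the total.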

TABLE 3. WHITE AND FREE COLORED POPULATION BY PLACE OF
BIRTH, 1850, AS SHOWN IN TWO SOURCES, BY STATES

                                                                             Born and living
State or territory         Native            Foreign born      Unknown        in same state
                            A*         B†        A*        B†    A*    B†          A*         B†

United States       17,737,578 17,708,299 2,210,839 2,244,602              13,624,902 13,457,049
Maine                  551,129    550,878    31,456    31,825                 517,117    515,583
New Hampshire          304,227    303,563    13,571    14,265                 261,591    258,471
Vermont                280,966    280,055    32,831    33,715                 232,086    228,941
Massachusetts          830,066    827,430   160,909   164,024                 695,236    685,324
Rhode Island           124,299    123,564    23,111    23,902                 102,641    101,260
Connecticut            332,525    331,560    37,473    38,518                 292,653    289,984
New York             2,439,296  2,436,771   651,801   655,929               2,151,196  2,129,651
New Jersey             430,441    428,940    58,364    59,948                 385,429    382,120
Pennsylvania         2,014,619  2,006,207   294,871   303,417               1,844,672  1,825,078
Ohio                 1,757,556  1,757,746   218,512   218,193               1,219,432  1,215,876
Indiana                931,392    930,458    54,426    55,572                 541,079    525,732
Illinois               736,931    736,149   110,593   111,892                 343,618    333,753
Michigan               341,591    341,656    54,852    54,703                 140,648    138,427
Wisconsin              197,912    194,099   106,695   110,477                  63,015     54,479
Minnesota Terr.          4,007      4,097     2,048     1,977                   1,334      1,586
Iowa                   170,620    170,931    21,232    20,969                  50,380     41,357
Missouri               520,826    517,100    72,474    76,592                 277,604    266,934
Delaware                83,968     83,978     5,211     5,253                  72,351     72,523
Maryland               438,916    441,108    53,288    51,209                 400,594    399,396
D. C.                   42,956     43,033     4,967     4,918                  24,967     24,372
Virginia               926,154    925,677    22,394    22,985                 872,923    867,691
North Carolina         577,750    577,693     2,524     2,581                 556,248    556,301
South Carolina         274,813    274,759     8,662     8,707                 262,160    262,016
Georgia                518,079    517,450     5,907     6,488                 402,666    397,560
Florida                 45,320     45,355     2,757     2,769                  20,563     19,924
Kentucky               740,881    738,671    29,189    31,420                 601,769    587,797
Tennessee              755,655    756,019     5,740     5,653                 585,084    585,835
Alabama                420,032    420,245     7,638     7,509                 237,542    236,332
Mississippi            291,114    291,352     4,958     4,788                 140,885    136,141
Arkansas               160,345    160,536     1,628     1,471                  63,206     61,289
Louisiana              205,921    204,039    66,413    68,233                 145,474    142,119
Texas                  137,053    136,272    16,774    17,681                  49,160     43,444
New Mexico Terr.        59,261     59,187     2,063     2,151                  58,421     58,415
Utah Terr.               9,355      9,300     1,990     2,044                   1,381      1,163
Oregon Terr.            11,992     12,081     1,159     1,022                   3,175      2,410
California              69,610     70,340    22,358    21,802   629             6,602      7,765

* The Seventh Census of the United States: 1850, Table XV, pp. xxxvi-xxxvii.
† Ibid., Tables XVI and XVII, p. xxxviii, and Table III for each state.

In 1854 a Compendium of the Seventh Census appeared which differed from
both earlier sets of data. The detailed place of birth table, Table XV of the full
report, was repeated in the Compendium with several small changes affecting
the native population,⁸ one being the elimination of the disputed persons born
in California and living in Connecticut. The Compendium also repeated
(on pp. 61 and 79) Tables XVI and XVII of the full report with differences for
three states, Connecticut, Georgia, and Virginia. These were usually small but
in one instance reached 1300.

8 The other affected categories were born and living in Kentucky, and unknown birthplace for the United
States, New Jersey, Ohio, and Tennessee.
Treatment of Mexicans in the 1930 Census. Except for changing coverage
there are no special problems with the censuses of 1860 through 1920, but in 
1930 a change in classification was made which makes it very difficult to com- 
pare results for this census with those at other censuses. At this census Mexicans 
were classified as nonwhite for the only time in census history. As a result the 
native nonwhite group was increased by 805,535 as compared with other 
censuses. However, only California and the Southwestern States were seriously 
affected by this shift in classification. 

Errors in 1940. A number of errors, some quite large, have been discovered
in the published tables for 1940. A comparison between the published and
corrected figures, as obtained from unpublished materials of the Bureau of the
Census, is shown in Table 4. These errors affect only native white males
residing in rural farm areas of Virginia.

TABLE 4. PUBLISHED AND CORRECTED STATE OF BIRTH DATA,
NATIVE WHITE MALES RESIDING IN VIRGINIA, 1940

State of birth      Published    Corrected

California              1,454        1,443
Colorado                  515          514
Florida                 2,006        2,005
Idaho                     177
Illinois                3,409
Indiana                 2,364
Iowa                    1,404
Kansas
Minnesota
Montana
Nebraska
New York
North Dakota
Ohio
Oregon
South Dakota
Texas
Utah
Virginia
Washington
West Virginia
Wisconsin
Wyoming

Sampling variability and processing errors in 1950. In 1950, for the first time, 
the published state-of-birth data were based upon a sample. Although infor- 
mation was collected on a full count basis, the state-of-birth tabulations were 
made for a 20 per cent sample. As a result questions of sampling design and 
variability were introduced and, in addition, errors were made in the tabulation 
process. 

The 20 per cent sample chosen for tabulation was that on which other 
sample items were based. This was drawn from every fifth line of the census 
schedule, but a perfect sample was not obtained because of blanks, voided lines,
special entries, and the tendency of enumerators to avoid putting people on a
sample line when it entailed more work (e.g., for children falling on a sampling
line fewer blanks were to be filled than for adults falling on such lines). The most
important result of this type of error was a 1.45 per cent shortage of males 25 and
over in the sample as compared with the full count. 

In the processing of the cards there were inevitably machine errors, loss of 
cards, and other types of errors. Unlike the earlier censuses, however, there was 
no attempt to adjust tabulations to bring them into exact agreement if the dif- 
ferences fell within tolerance limits which were considered insignificant. And, 
since only the cards for natives born outside the state of residence or for whom 
state of birth was unknown were tabulated, the number of persons born in the 
state of residence had to be obtained by subtraction from the total native 
population. The effect was to reproduce in the populations born in the state of 
residence the net errors from other categories. The joint effect of sampling bias, 
sampling variability, and processing errors cannot be precisely estimated. In 
general, it is probably small but it is obvious that even more care than usual 
should be used in interpreting small basic figures or small differences. 
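
The subtraction step described above can be illustrated with a small Python sketch. All figures are invented and the function name is hypothetical; the sketch shows only why a net error in the tabulated categories reappears, with opposite sign, in the residual count of persons born in the state of residence.

```python
def born_in_state_of_residence(total_native, born_other_states, unknown):
    """Residual: natives not tabulated as born elsewhere or as unknown."""
    return total_native - born_other_states - unknown

# Invented full-count total and invented "true" category counts.
total_native = 1_000_000
true_other, true_unknown = 250_000, 10_000
true_same = born_in_state_of_residence(total_native, true_other, true_unknown)

# Suppose sampling and processing leave the tabulated categories 1,200 short.
tab_other, tab_unknown = 249_000, 9_800
est_same = born_in_state_of_residence(total_native, tab_other, tab_unknown)

# Net error in the tabulated categories versus error in the residual.
net_error_tabulated = (tab_other + tab_unknown) - (true_other + true_unknown)
print(est_same - true_same, net_error_tabulated)
```

In this made-up case a shortage in the tabulated categories becomes an equal excess in the residual, which is the sense in which the net errors are reproduced.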


II. FIVE-YEAR MIGRATION DATA OF THE 1940 CENSUS 


Until 1940 the state-of-birth data and the Department of Agriculture estimates
of movements to and from farms were the only statistics of internal migration
for the United States. Since the latter show only the shift to and from
farms and cannot be used to indicate movements from one part of the country
to another, this meant that the bulk of the information about migrants was 
restricted to information on migrants between states, that the shortest period to 
which the migration could be related was the ten years or so between censuses, 
and that little was known about the characteristics of migrants. Furthermore 
there was no information on the internal migration of the foreign born. 

During the depression years of the 1930’s, however, considerable attention 
was focused upon internal migration. There were reports, substantiated in 
part by Department of Agriculture estimates, that a back-to-the-land move- 
ment had occurred, and the streams of “Okies” and “Arkies” into California 
aroused national attention, especially when California tried to bar their entry. 
Under the New Deal there was an unprecedented increase in national and 
regional planning, and it became obvious that migration data were needed for 
periods shorter than census intervals, that the whole population should be 
covered, that intrastate as well as interstate migration should be shown, and 
that the characteristics of migrants should be determined. 

Outside pressures for such information and the growing interest of the pro- 
fessional staff of the Bureau of the Census in internal migration resulted in the 
inclusion in the 1940 census schedule of the question, “In what place did this 
person live on April 1, 1935?” On the basis of answers to this question popula- 
tions were divided into migrants and nonmigrants, and the migrants were 
classified by origin and destination. 

The definitions of migrants and nonmigrants at this census set a pattern 
which has been followed with little variation in succeeding censuses and
population surveys. Nonmigrants were defined as persons living in the same county
or quasi county in 1935 as in 1940, and in some reports nonmigrants were 
divided into persons living in the same house at both times and those living in 
different houses in the same county or quasi county. Migrants, then, were 
persons who had changed residence from one county or quasi county to another, 
and three groups were distinguished: intra-state migrants, migrants between 
contiguous states, and migrants between noncontiguous states. All cities of 
100,000 or more were considered quasi counties and, where the city did not in- 
clude the whole county, the remainder constituted another quasi county. In- 
dependent cities were treated as counties, regardless of size. One especially un- 
fortunate bit of terminology was used. All persons living in the continental 
United States in 1940 but who had been living elsewhere in 1935 were termed 
“immigrants,” regardless of whether they were natives or foreign born or 
whether they had been in the territories or possessions or in foreign countries. 
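
The 1940 definitions can be restated as a small classification rule. The Python sketch below is illustrative only: the type names, the representation of a residence as a (state, county or quasi county) pair, and the deliberately tiny list of contiguous state pairs are all assumptions made for the example, not part of the census procedure.

```python
from enum import Enum

class Status(Enum):
    NONMIGRANT = "nonmigrant"
    INTRASTATE = "migrant within a state"
    CONTIGUOUS = "migrant between contiguous states"
    NONCONTIGUOUS = "migrant between noncontiguous states"
    IMMIGRANT = "immigrant (outside the continental U.S. in 1935)"

# Hypothetical, deliberately incomplete adjacency list; a real one
# would cover every pair of bordering states.
CONTIGUOUS_PAIRS = {
    frozenset({"Ohio", "Indiana"}),
    frozenset({"Ohio", "Kentucky"}),
}

def classify(res_1935, res_1940):
    """Classify a person by 1935 and 1940 residence.

    Each residence is a (state, county_or_quasi_county) pair; res_1935 is
    None for a person living outside the continental United States in 1935.
    """
    if res_1935 is None:
        return Status.IMMIGRANT
    if res_1935 == res_1940:
        return Status.NONMIGRANT  # same county or quasi county
    state_35, state_40 = res_1935[0], res_1940[0]
    if state_35 == state_40:
        return Status.INTRASTATE
    if frozenset({state_35, state_40}) in CONTIGUOUS_PAIRS:
        return Status.CONTIGUOUS
    return Status.NONCONTIGUOUS

print(classify(("Ohio", "Franklin"), ("Ohio", "Cuyahoga")).value)
print(classify(("Indiana", "Marion"), ("Ohio", "Franklin")).value)
print(classify(None, ("Ohio", "Franklin")).value)
```

Note that the rule compares counties or quasi counties, so a move between houses within the same county leaves a person a nonmigrant.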

Four special reports on migration, each a volume in itself, were issued at this 
census and additional materials were included in the reports on family char- 
acteristics and on differential fertility. The first of the special reports, Color and 
Sex of Migrants, was designed to show the flow of migrants from one part of 
the country to another and to indicate the division of the population by migra- 
tion status for the largest possible number of geographic subdivisions. Non- 
migrants, by place of residence in 1940, and migrants, cross-classified by 
residence in 1935 and in 1940, were shown for states, urban and rural parts of 
states, and cities of 100,000 or more. 

In succeeding reports the detailed listing of origin and destination was cur- 
tailed in order to present more characteristics of migrants and nonmigrants. 
In the second report, Age of Migrants, eleven age groups (under 5, 5-13, 14-17, 
18-19, 20-24, 25-29, 30-34, 35-44, 45-54, 55-64, and 65 and over) were shown 
by color and sex. For each of the types of areas shown above migrants can be 
compared with nonmigrants, but only for regions is there a cross-classification 
of origin and destination. A similar plan was followed in the two following 
reports. 

The third report, Economic Characteristics of Migrants, contained data on 
the employment status and major occupation group of migrants and non- 
migrants aged 14 and over, by sex. The fourth report, Social Characteristics of 
Migrants, dealt with nativity and citizenship, relationship to head of household, 
and education in terms of years of school completed. Again a classification of 
migrants and nonmigrants was made by sex. The education data were re- 
stricted to the age group 25-34. 

Important data were also presented in the series on differential fertility and 
family characteristics. In Differential Fertility 1940 and 1910, Women by Number 
of Children under 5 Years Old, native white and Negro women aged 15-49 were 
classified by five-year age groups, migration status, marital status, number of 
children under 5 per 1,000 women, and by specific number of children under 5. 
Native white women were further classified by urban-rural residence in 1935 
and 1940. These data were presented for the United States and for regions. For 
the United States an additional classification was made for years of school 
completed for native white females aged 15-49 by five-year age groups. Not 
only were these the only data on the fertility of migrants, but they also con- 
tained the only information on the marital status of migrants and on education 
by color or for ages outside the range 25-34. 

In the publications on characteristics of families, data are found on the mi- 
gration status of family heads. Migrants can be compared with nonmigrants 
in regard to size of family, age of head, tenure, and rent. It was in these reports 
and those on differential fertility that the division of nonmigrants into those 
living in the same house at the opening and close of the period and those living 
in different houses in the same county was introduced. 

Though they set a new standard for information on internal migration, there 
were specific shortcomings of the 1935-1940 data which reduced their usefulness. 
One of the most important of these was the biased reporting of rural-urban 
origin due to the tendency of migrants from rural areas to give the name of a 
nearby city as the place of origin. While no quality check was made on the 1940 
census data, the results of a reliability check of a special census in Wilmington, 
N.C. in 1946 illustrate the possible magnitude of this bias. 

Following this census a sample of about one-fifth of the households enumer- 
ated were checked by a second set of enumerators in regard to the answer to 
the question, “Where was this person living on April 1, 1940?” Eighteen per 
cent of those who were reported as living in urban places in the census were found 
by the second enumerator, who had been specially trained and who asked addi- 
tional questions, to have lived outside city limits. This led Shryock to con- 
clude that: 

“These findings convinced us that there was little hope of obtaining accurate 
information in 1950 on the urban or rural origin of migrants. This conclusion 
was made in the light of the probable number of migration questions that might 
be asked (four at most), the probable capacities of the 150,000 to 200,000 who 
would be hired for the occasion, and the probable two days of paid training for 
the Censuses of Population, Housing, and Agriculture.”⁹ 

Another instance of misreporting, of some importance because these are the 
only data of their kind, was the considerable misclassification of persons 
on public emergency work in the statistics on employment status.¹⁰ While 
the payrolls of Federal emergency agencies at about the time of the census 
showed 3,377,978 persons on public emergency work, including the NYA student 
program, only 2,529,606 persons were reported in the census within this 
category. The amount of the misclassification varied greatly from state to state 
with many emergency workers reporting themselves as “at work” and persons 
in the NYA student program were frequently returned as “in school” and 
therefore not in the labor force. 
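
The size of this discrepancy follows directly from the two counts quoted above. The short sketch below works out the shortfall; the variable names are illustrative only, and the two figures are those given in the text.

```python
# Counts quoted in the text (persons on public emergency work, 1940).
payroll_count = 3_377_978   # Federal emergency agency payrolls, incl. NYA students
census_count = 2_529_606    # persons returned in this category by the census

# Persons on the payrolls who were not returned as emergency workers.
shortfall = payroll_count - census_count
shortfall_share = shortfall / payroll_count

print(f"Shortfall: {shortfall:,} persons")
print(f"Share of payroll count not returned: {shortfall_share:.1%}")
```

On these figures the census returned roughly three quarters of the persons carried on the emergency payrolls.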

In some instances (relationship to head of household, for example) the 
comparison between migrants and nonmigrants was biased by the inclusion 
of all children under 5 years old in the nonmigrant category, since 
they had no residence in 1935. The possible effect is suggested by statistics on 


9 Henry S. Shryock, Jr., “Measurement of Internal Migration” (unpublished paper read before the Population 
Association of America at Philadelphia, Pennsylvania on May 23, 1948). 

10 Sixteenth Census of the United States: 1940, Population, Internal Migration 1935 to 1940, Economic Characteristics 
of Migrants, pp. 3-4. 
the state of birth of children under five years old in 1940.¹¹ Thirty-three per cent 
of these children who were born in the District of Columbia were living else- 
where at the time of the census, and 14 per cent of this age group who were 
living in the District of Columbia were born elsewhere in the United States. 
Even for states the proportion could be high. Ten per cent of those born in 
Oklahoma were living in other states in 1940, and ten per cent of those living 
in California were born in other states. 

Other criticisms of the 1940 census materials on migration which have more 
than minor importance were the use of states as the smallest units by which 
most of the data were published and the lack of crossclassification. For many 
purposes the state is too large for good analysis. An attempt to remedy this 
defect was the tabulation of migrants for selected subregions by the Scripps 
Foundation and the Bureau of Agricultural Economics in conjunction with the 
Bureau of the Census.¹² 

The lack of crossclassification was perhaps a more serious defect. In the two 
important reports on economic and social characteristics of migrants there was 
no classification by color and no control for age, except the restriction of the 
data on education to the 25-34 age group and that on economic characteristics 
to the population aged 14 and over. Inasmuch as color and age are variables 
of utmost importance in employment status, occupation, nativity, citizenship, 
household relationship, and education, the analytical possibilities of these data 
are greatly limited. 


III. ONE-YEAR MIGRATION DATA IN THE 1950 CENSUS 


The collection and tabulation of migration data in 1950 differed in a number 
of important ways from that of 1940. The migration questions with a specific 
time reference were asked of a twenty per cent sample instead of the whole 
population as in 1940. The period covered was one year instead of five, and 
the basic unit for which much of the detailed data were tabulated was the state 
economic area instead of the state. Farm and nonfarm residence was substituted 
for the urban-rural classification of area of out-migration, and much of the data 
were classified in terms of metropolitan and nonmetropolitan subregions. For 
several differentials the crossclassification by the important demographic vari- 
ables was greatly improved. 

For a twenty per cent sample of the population aged one year and over the 
question was asked, “Was he living in the same house a year ago?” If the answer 
was “No,” these questions followed: “Was he living on a farm a year ago?” 
“Was he living in a different county a year ago?” For persons living in a differ- 
ent county in the continental United States the county and state were recorded, 
and for persons living outside the continental United States the country, ter- 
ritory, or possession was listed. 

In the tabulations the population was divided into nonmovers, those living in 


11 Sixteenth Census of the United States: 1940, Population, State of Birth of the Native Population, Table 36. 

12 Donald J. Bogue, Henry S. Shryock, Jr., and Siegfried A. Hoermann, Subregional Migration in the United 
States, 1935-40, Volume I, Streams of Migration Between Subregions (Miami, Ohio: Scripps Foundation Studies in 
Population Distribution, Number 5, 1957) and Donald J. Bogue and Margaret Jarman Hagood, Subregional Migra- 
tion in the United States, 1935-40, Volume II, Differential Migration in the Corn and Cotton Belts (Miami, Ohio: 
Scripps Foundation Studies in Population Distribution, Number 6, 1953). 
the same house in 1949 and 1950; the movers, those living in different houses; 
those living abroad in 1949; and persons for whom migration status was not 
recorded. Movers were divided into intracounty movers and migrants, again 
defined as persons who had changed residence from one county to another, 
but with no quasi counties constituted. “Persons abroad” were the same group as 
the “immigrants” in 1940 and included natives as well as the foreign born, 
persons living in territories and possessions as well as in foreign countries. 
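
The sequence of questions and the resulting tabulation categories can be expressed as a small decision rule. The following is a minimal sketch only: the function name, the argument encoding, and the handling of a blank county answer are assumptions made for illustration and are not taken from the census instructions.

```python
from typing import Optional


def mobility_status_1950(same_house: Optional[bool],
                         different_county: Optional[bool] = None,
                         abroad: bool = False) -> str:
    """Assign one of the 1950 tabulation categories described in the text.

    Hypothetical encoding of the schedule answers:
    same_house       -- "Was he living in the same house a year ago?" (None if blank)
    different_county -- "Was he living in a different county a year ago?" (None if blank)
    abroad           -- True if the 1949 residence was outside the continental U.S.
    """
    if same_house is None:
        return "mobility status not reported"
    if same_house:
        return "nonmover"
    if abroad:
        # Includes natives as well as the foreign born, as in the text.
        return "abroad in 1949"
    if different_county is None:
        # Assumption: a mover with no county answer is left unclassified.
        return "mobility status not reported"
    return "migrant" if different_county else "intracounty mover"


print(mobility_status_1950(False, different_county=True))
print(mobility_status_1950(False, different_county=False))
```

Note that, unlike 1940, no quasi counties enter the rule: a move within one county is an intracounty move regardless of city boundaries.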

Migration data for 1950 are scattered through the general reports of the Cen- 
sus and are contained in a number of special reports. In Volume II, Character- 
istics of Population, statistics on the mobility status of the population in states, 
counties, cities, and small places are given, and in Volume III these statistics 
are extended to census tracts. In addition to the three special reports on 
migration, migration data are included in the special reports entitled Occupa- 
tional Characteristics, General Characteristics of Families, Institutional Popu- 
lation, Characteristics by Size of Place, and Education. 

In Occupational Characteristics the mobility status of the experienced civilian 
labor force is shown by sex and detailed occupation for the United States. In 
General Characteristics of Families the mobility status of family heads is given 
by age of head and type of family for urban and rural areas in the United States. 
The special report, Institutional Population, contains data on the mobility of 
the institutional population by type of institution, age, and sex, for the United 
States. Characteristics by Size of Place gives mobility status by age, sex, color, 
and size of place for the United States and for regions. In the special report, 
Education, the mobility status of children aged 5 to 13 is shown by school en- 
rollment, grade in which enrolled, age, color, and sex for the United States, the 
South, and the North and West combined. For the population aged 14 and over 
years of school completed are shown by age, sex, color, and migration status 
for the United States and for regions. 

Most of the data on migration flows and the detailed characteristics of 
migrants and nonmigrants for areas smaller than regions are found in the three 
special reports on mobility: Population Mobility—States and State Economic 
Areas; Population Mobility—Characteristics of Migrants; and Population Mobility—Farm-Nonfarm 
Movers. The first of these shows the flow of migrants from 
state to state by color, and the mobility status of the populations of states and 
state economic areas, by a number of demographic and socioeconomic char- 
acteristics. These are: color by sex, age, marital status (ages 14 and over), 
years of school completed (ages 25 and over), employment status (ages 14 and 
over), major occupation group of employed males, and family income in 1949. 
For states these are given without a color break for total population, intracounty 
movers, migrants within state economic areas, and migrants between 
state economic areas. For state economic areas the corresponding categories 
are total population, intracounty movers, migrants within the state economic 
area, in-migrants to the area, and out-migrants from the area. Similar tabula- 
tions are given for nonwhites in state economic areas with 25,000 or more of 
this color group. 

In Population Mobility—Characteristics of Migrants information on the characteristics 
of mobile and nonmobile persons is extended by a detailed age crossclassification 
(1-4, 5-6, 7-9, 10-13, 14-19, 20-24, 25-29, 30-34, 35-44, 45-64, 
65 and over). For ages under 14 only three characteristics are shown: color by 
sex, type of migration by sex, and relationship to head of household. For ages 
14 and over these are repeated and, in addition, marital status by sex, education 
by color (for age groups over 25 years), employment status by sex, major oc- 
cupation group of males, and personal income of males are given. For the United 
States a further crossclassification is made by type of residence in 1949 and in 
1950 (metropolitan and nonmetropolitan state economic area, further divided 
into farm and nonfarm). For regions and selected divisions the characteristics 
are shown for in-migrants, out-migrants, and migrants within the area, and 
interregional migrants are also crossclassified by region of residence in 1949 and 
region of residence in 1950. 

The third report, Population Mobility—Farm-Nonfarm Movers, shows the 
movement to and from farms. For state economic areas the intracounty mov- 
ers, migrants within the area, in-migrants to the area, and out-migrants from 
the area are shown by farm and nonfarm residence in 1949 and urban, rural- 
nonfarm and rural farm residence in 1950. For economic subregions the follow- 
ing characteristics of movers by sex, farm-nonfarm residence in 1949 and farm- 
nonfarm residence in 1950 are given: urban-rural residence in 1950, type of 
mobility (within a county, between counties of a state, between contiguous 
states, and between noncontiguous states), color, age, marital status (ages 14 
and over), years of school completed (ages 25 and over), employment status 
(ages 14 and over), major occupation group (employed persons), and family 
income in 1949. For economic subregions with 50,000 or more nonwhites in 1950, 
these tabulations were repeated for nonwhites. 

The state economic areas shown in these reports are single counties or com- 
binations of counties, chosen because of economic and social similarity, to 
divide the states into a manageable number of relatively homogeneous units. 
There are 501 such areas but some of the least populated agricultural areas 
were combined for the migration tabulations so that the number was reduced 
to 443. These, in turn, were recombined for the report on farm-nonfarm movers 
into 119 subregions which, unlike state economic areas, cut across state lines. 

The tabulation of migrants for state economic areas and economic subre- 
gions has served a useful purpose inasmuch as the flow and characteristics of 
migrants and nonmigrants can be observed for relatively homogeneous areas. 
However, it should be noted that these areas vary even more in size than do 
counties or states. Litchfield County, Connecticut is one state economic area 
and the entire State of Nevada is another. Whereas some eastern 
subregions correspond roughly to standard metropolitan areas, the State 
of Nevada is combined with parts of Arizona, California, Oregon, and Utah to 
form an economic subregion. In general, state economic areas and subregions 
are much smaller in the east than in the west. 

More important than the variation in size is the loss of ability to relate migra- 
tion to much of the available social and economic data. A great deal of other 
data is assembled only for states and some of the most important data relating 
to migration are found only for state economic areas or economic subregions. 
For example, no age distributions of in-migrants to states (except for Nevada) 
can be obtained since this was tabulated only for state economic areas and 
economic subregions. 

As with the state-of-birth data for 1950 these migration statistics are affected 
by sampling variability and, more important, by the failure of enumerators to 
follow instructions exactly and errors in the tabulation process. Already men- 
tioned in the discussion of state-of-birth data for 1950 was the under-enumera- 
tion of adult males in the twenty per cent sample, a shortage of 1.45 per cent 
for the United States as a whole as compared with the total count. Again there 
was no attempt to reconcile totals from one run of the cards with another or to 
achieve internal consistency if the differences were considered insignificant. 
And, as with the state-of-birth statistics, residual categories, in this case the 
nonmobile populations, were obtained by subtracting obtained counts from 
known totals. 

In the Post-Enumeration Survey 8.2 per cent of the respondents were re- 
ported differently in regard to mobility status than in the census. Most of these 
errors, however, were offsetting, though it is noted that the number of intra- 
county movers seemed slightly understated and the number of persons abroad 
in 1949 overstated. Another measure of the consistency of response was af- 
forded by the Current Population Survey for roughly the same period (March 
1949 to March 1950). For the United States the percentage distributions by 
mobility status are shown below: 


                                  Current Population     1950 
Residence in 1949                       Survey          Census 

Same house 
Different house, same county 
Different county, same state 
Different state 
Abroad 

[Percentages not legible in the source.] 


One of the presumed advantages of a migration period shorter than that of the 
1940 census was better recall and a smaller number of “unknowns” in the data. 
Actually, however, there is no evidence of better recall, and the number of unknowns, 
partly because of the lack of editing of the schedules, was high. Of the 
population aged one and over 2,547,450 were listed as “mobility status not 
reported.” This number was 1.7 per cent of the total population and larger 
than either the number of migrants between contiguous states or that be- 
tween noncontiguous states. It is of value to note that in the Post-Enumeration 
Survey a higher proportion of mobile than of nonmobile persons were found 
among the group with “mobility status not reported” in the census. 


IV. CURRENT POPULATION SURVEY 


Shortly after the Census Bureau assumed responsibility for the Monthly 
Report on the Labor Force and converted it into the Current Population Survey, 
it began presenting reports on the mobility of the population of the United 
States. The first reports, issued from 1945 to 1948, covered variable periods: 
December 1941 to March 1945, April 1940 to February 1946, April 1940 to 
April 1947, August 1945 to August 1946, and August 1945 to October 1946. 
The value of such reports was greatly increased in 1948 when an annual series, 
covering the period 1947-1948 down to the present, was inaugurated. For all 
except 1949-1950, 1955-1956, and 1957-1958, where the surveys covered the 
year from March to March, the month of reference has been April. 


13 U. S. Census of Population: 1950, Volume IV, Special Reports, Part 4, Chapter B, Population Mobility— 
States and State Economic Areas, p. 8. 

These mobility statistics are based upon the “labor force sample” which, ex- 
cept for the survey for 1940-1947 when the sample was somewhat larger, 
covered about 25,000 households in 68 sample areas in 42 states until 1954. 
About 22,000 interviews were made. In 1954 the sample was changed to in- 
clude 230 sample areas in 453 counties but the number of interviews remained 
approximately the same. In 1957 the sample was again redesigned and in- 
creased to include 330 sample areas in 638 counties in all 48 states and the Dis- 
trict of Columbia with about 35,000 households interviewed. This increase in 
the size of the sample greatly lowered sampling variability; for population bases 
of 500,000 the standard error was approximately halved and for smaller bases 
the improvement was, of course, even greater. 

In general, the classification of migrants and nonmigrants has followed the 
practices of the 1940 and 1950 censuses. A major change has been made, however, 
in the treatment of children born during the period covered by the survey. 
Prior to 1947, the classification of migrants followed the procedure of the 1940 
census and included all children born during the period of the survey in the 
nonmobile group. For the surveys taken in 1947 and 1948 children born during 
the period were classified as migrant or nonmigrant, depending upon the usual 
residence of the mother at the time of the birth. For surveys taken in 1949 and 
later, children born during the period of the survey were excluded from both 
categories. The survey covering the period, August 1945 to October 1946, dif- 
fered from the others in that migrants were defined as persons who, as civilians, 
changed their county of residence at any time during the period. 
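
The three successive treatments of children born during the survey period can be summarized as a rule keyed to the year in which the survey was taken. This is a sketch for illustration only; the function and argument names are hypothetical, and the mother's residence is reduced to a single yes-or-no flag.

```python
from typing import Optional


def classify_child_born_in_period(survey_year: int,
                                  mother_lived_elsewhere_at_birth: bool) -> Optional[str]:
    """Return the category assigned to a child born during the survey period.

    survey_year -- year in which the survey was taken
    mother_lived_elsewhere_at_birth -- hypothetical flag: True if the mother's
        usual residence at the time of the birth was in a different county
        from the child's residence at the survey date
    Returns None when such children are left out of both categories.
    """
    if survey_year < 1947:
        # 1940 census practice: all such children counted as nonmobile.
        return "nonmigrant"
    if survey_year in (1947, 1948):
        return "migrant" if mother_lived_elsewhere_at_birth else "nonmigrant"
    # Surveys taken in 1949 and later exclude these children altogether.
    return None


for year in (1946, 1947, 1950):
    print(year, classify_child_born_in_period(year, True))
```

Because the rule changed twice, rates for the youngest ages are not strictly comparable across these three groups of surveys.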

In the study of migration one of the most important gains from the Current 
Population Survey is the information on the mobility of the population for 
short periods between censuses. Other than the Department of Agriculture 
estimates of farm-nonfarm movement, these are the only data in which short 
cycles can be observed and in which the turning points of long cycles can be 
found. Also of great importance is the use of better selected and trained 
interviewers in the Current Population Survey than in the Census. Furthermore, 
these interviewers can ask more questions, take more time to get good answers, 
and probe if necessary. These advantages are so great as compared with the 
hurried and nearly untrained census enumerators that most census officials are 
inclined to consider the Current Population Survey more reliable than the 
census in regard to population mobility. 

Knowledge of the characteristics of migrants has been greatly extended by 
the Current Population Survey because the questions asked are varied from 
survey to survey and now cover a large number of characteristics. These are 
now so numerous that they cannot be detailed here, but they are listed in Table 5, 
which also shows the year in which particular characteristics were obtained. Only age 
and sex have been tabulated for each of the annual surveys. 


TABLE 5. MIGRATION STATUS DATA IN CURRENT POPULATION REPORTS 

Period covered by migration question (columns): 
1941-1945, 1940-1946, 1945-1946, 1940-1947, 1947-1948, 1948-1949, 1949-1950, 
1950-1951, 1951-1952, 1952-1953, 1953-1954, 1954-1955, 1955-1956, 1956-1957, 
1957-1958 

Characteristics (rows): 
Region of residence 
Urban-rural residence 
Farm-nonfarm movement 
Color 
Sex 
Age 
Marital status 
Year of marriage or duration of marital status 
Family and household status 
Size of family 
Number of children 
Veterans status 
Employment (or labor force) status 
Work status 
Weeks of unemployment 
Number of different jobs held 
Major occupation group 
Industry group 
Class of worker 
Income 
Tenure and type of housing 
Years of school completed 
School enrollment 
Reason for last civilian move 
Year of last move 
Number of moves 

[The x entries marking the periods for which each characteristic was obtained are not legible in the source.] 

Sources: 
1941-1945    P-S No. 5; P-S No. 6; and P-S No. 8 
1940-1946    P-S No. 11; P-S No. 14 
1945-1946    P-20 No. 4; P-S No. 14; P-S No. 24 
1940-1947    P-20 No. 14; P-20 No. 17; P-60 No. 5 
1947-1948    P-20 No. 22; P-50 No. 10 
1948-1949    P-20 No. 28; P-50 No. 20 
1949-1950    P-20 No. 36 
1950-1951    P-20 No. 39 
1951-1952    P-20 No. 47 
1952-1953    P-20 No. 49 
1953-1954    P-20 No. 57 
1954-1955    P-20 No. 61; P-20 No. 67 
1955-1956    P-20 No. 73 
1956-1957    P-20 No. 82 
1957-1958    P-20 No. 85 

The great disadvantage of the Current Population Survey is that the sample 
is so small that data cannot be published for states or even for divisions. The 
only subdivisions of the United States for which data have been shown, there- 
fore, are regions, but there have been classifications of region of residence at 
beginning of the period with region of residence at its end. For the United 
States as a whole, data have been given for urban, rural nonfarm, and rural 
farm areas, and for standard metropolitan and nonmetropolitan areas. Within 
standard metropolitan areas, a division has been made between the central city 
and the surrounding urban and rural areas, permitting an analysis of relative 
growth of central and suburban areas. 


V. DEPARTMENT OF AGRICULTURE ESTIMATES OF 
MOVEMENTS TO AND FROM FARMS 


The longest annual migration series is that of movements to and from farms, 
prepared within the Department of Agriculture by the Bureau of Agricultural 
Economics and its successor organization, the Agricultural Marketing Service. 
Dating back to 1920, when farm residence was first used in the census as a 
residence classification, this series has continued to the present. Estimates are 
prepared for each state of movements to farms in the state from other farms, 
movements from farms in the state to other farms, movements to farms in the 
state from cities, towns, and villages, movements from farms in the state to 
cities, towns, and villages; and in recent years estimates have been made of 
movements between farms and the armed forces. State estimates are not pub- 
lished but are combined into totals for the nine census divisions. 

Over so long a period the estimates could not have been completely consistent. 
In the earlier years they were made by the Department of Agriculture alone 
and with sampling and survey techniques that are much inferior to those now 
in use. Since 1944, however, the construction of these estimates has been the 
joint work of the Department of Agriculture and the Bureau of the Census, 
and they have been much improved through the use of the carefully planned 
Current Population Survey. 

From 1920 to 1944 the estimates were prepared as follows. Mail question- 
naires were sent from the office of the agricultural statistician in each state to 
farmers known as crop reporters. Each was asked to report on the number of 
persons on his farm and on adjoining farms at the beginning and end of the 
year, and on the number of births, deaths, moves to these farms, and moves 
away from these farms during the year. For the population covered by the re- 
turned questionnaires ratios of births, deaths, movements to farms, and move- 
ment from farms were computed using the sample population at the beginning 
of the year as a base. The derived ratios were then applied to the total farm 
population of the state at the beginning of the year to obtain estimates of births, 
deaths, movements to farms, and movements from farms during the year. 
Births and in-movements were then added to the initial population and deaths 
and out-movements were subtracted to obtain the estimated population at the 
end of the year. Since errors were cumulative in this procedure the intercensal 
estimates were revised whenever new farm populations could be obtained from 
population or agricultural censuses. 
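
The 1920-1944 procedure amounts to a ratio estimate followed by a population balance. A minimal sketch is given below; the function name, the dictionary keys, and all figures are invented for illustration and do not reproduce any actual Department of Agriculture computation.

```python
def estimate_farm_population(sample, state_farm_pop_start):
    """Apply sample ratios to the state farm population and balance the year.

    sample -- counts from returned questionnaires, with the sample population
              at the beginning of the year as the base (hypothetical keys)
    state_farm_pop_start -- total farm population of the state at the
              beginning of the year
    """
    base = sample["population_start"]
    components = ("births", "deaths", "moves_in", "moves_out")

    # Ratio of each component of change to the beginning-of-year sample base.
    ratios = {name: sample[name] / base for name in components}

    # Apply the ratios to the state's total farm population.
    estimates = {name: ratios[name] * state_farm_pop_start for name in components}

    # Balance: add births and in-movements, subtract deaths and out-movements.
    pop_end = (state_farm_pop_start
               + estimates["births"] + estimates["moves_in"]
               - estimates["deaths"] - estimates["moves_out"])
    return estimates, pop_end


# Invented figures for a single hypothetical state.
sample_returns = {"population_start": 5_000, "births": 100, "deaths": 50,
                  "moves_in": 200, "moves_out": 300}
estimates, pop_end = estimate_farm_population(sample_returns, 1_000_000)
print(estimates)
print(round(pop_end))
```

Since each year's ending population becomes the next year's base, any bias in the ratios is carried forward, which is why the intercensal estimates had to be revised against census counts.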

In revising the 1930-1940 estimates it became obvious that farmers had 
tended to overreport movements to farms and underreport movements away 
from farms, and it was also found that unpublished estimates of farm popula- 
tion from the Monthly Report on the Labor Force differed considerably from 
those currently obtained by the Department of Agriculture. As a result a 
cooperative series on farm population, based upon the Current Population 
Survey, was initiated by the Bureau of Agricultural Economics and the Bureau 
of the Census in 1944. Quarterly reports were issued until April 1949 and annual 
reports thereafter. In recent years these reports have included estimates of 
the number of migrants and movers living on farms at the beginning of the 
year by farm or nonfarm residence at the end of the year, and vice versa. 

The Current Population Survey, however, gives the farm population and 
farm-nonfarm migrants only for the United States as a whole. Information 
on the geographic distribution of the farm population and on the factors of 
change within regions and divisions continues to be based upon mail question- 
naires to crop reporters. At present ratios are prepared for each state of move- 
ments between farms, movements between farms and the Armed Forces, and 
movements between farms and cities, towns, and villages. These are applied 
to the estimated farm population at the beginning of the year to obtain pre- 
liminary estimates of the various types of change during the year. A series of 
adjustments is then made. 

(1) Estimates of movements from farms to other farms are adjusted to conform 
with the estimates of movements to farms from other farms so that a zero 
balance is obtained for the United States as a whole. This procedure is justified by evidence 
that crop reporters’ statements of movements to adjoining farms are more re- 
liable than their statements of movements away from such farms. International 
movements are assumed to be negligible. 

(2) Where marked discrepancies exist, movements between farms and the 
Armed Forces are adjusted to data from the Department of Defense and the 
Selective Service System. 

(3) Movements between farms and cities, towns, and villages are adjusted 
to accord with the yearly changes in farm population due to migration or 
moving as indicated by the Current Population Survey for the United States 
as a whole. 

(4) An internal balance is struck between natural increase, movement to and 
from the armed forces, and the various types of changes of residence. 
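
Adjustments (1) and (3) both bring state figures into agreement with a controlling total. The text does not say how the adjustment is distributed among the states; the sketch below assumes a simple proportional scaling, and the state labels and figures are invented.

```python
def scale_to_control(state_estimates, control_total):
    """Scale state figures proportionally so that they sum to control_total.

    Proportional allocation is an assumption made for this illustration.
    """
    current_total = sum(state_estimates.values())
    if current_total == 0:
        raise ValueError("cannot scale estimates that sum to zero")
    factor = control_total / current_total
    return {state: value * factor for state, value in state_estimates.items()}


# Invented preliminary estimates for three hypothetical states.
to_farms_from_farms = {"A": 12_000, "B": 8_000, "C": 5_000}   # treated as more reliable
from_farms_to_farms = {"A": 9_000, "B": 7_000, "C": 4_000}

# Adjustment (1): moves away from farms are made to match moves to farms,
# giving a zero farm-to-farm balance for the country as a whole.
adjusted_out = scale_to_control(from_farms_to_farms,
                                sum(to_farms_from_farms.values()))
balance = sum(to_farms_from_farms.values()) - sum(adjusted_out.values())
print(adjusted_out, balance)
```

The same function could serve for adjustment (3), with the national change indicated by the Current Population Survey as the control total.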

These estimates of movement to and from farms are invaluable in the study 
of social and economic history and in the interpretation of current trends. In 
using them, however, the following limitations should be kept in mind. 

(1) Migration, when defined as movement to or from farms, is not the same 
thing as the migration shown by the other series discussed in this paper. In the 
other series migration is defined as a change of residence across state or county 
lines, but in this series it is defined as movement to or from a farm, regardless 
of distance and regardless of whether a political boundary has been crossed. 

(2) Though it has not been completely constant, the definition of a farm 
corresponds in general to that used in censuses and the population surveys and 
therefore differs from that used in agricultural censuses. In censuses and popu- 
lation surveys the determination of whether the place of residence is on a farm 
is generally left to the respondent. Mail questionnaires are sent to known farm- 
ers and they consider the neighbors for whom they report to be farmers. In 
agricultural censuses, where the farm population is partly determined by con- 
siderations of type and volume of production, a different farm population is 
enumerated. In 1950, for example, 7.5 per cent of the people classified as farm 
residents by the census of population lived on places that were not considered 
farms in the agricultural census, and, conversely, 5 per cent of the persons in 
farm operator households, as defined by the agricultural census, were classified 
as nonfarm residents by the census of population.¹⁴ 

(3) Sampling variation and response problems are such that little weight 
should be attached to small changes in the yearly estimates. A first considera- 
tion is the sampling error in the Current Population Survey which, as of 1957, 
included 3,800 farm households among the 35,000 in the sample. More serious 
are the deficiencies of mail surveys. The survey of crop reporters has been im- 
proved and is considered “adequate size” in most states but the crop reporters 
are known to be “not a thoroughly representative group of farmers for farm 
population estimating purposes.” Of the approximately 85,000 questionnaires 
sent out in each of the last few years only 20 to 25 per cent have been returned. 


VI. MIGRATION ESTIMATES BY RESIDUAL METHODS 


In the absence of migration data, such as that discussed above, estimates of 
net migration can be obtained by subtracting natural increase from total in- 
crease or, for the segment of the population born before the beginning of the 
period in question, by adding deaths to total increase. Residual methods are 
widely used and, in fact, account for some of the most useful information we 
have on historical migration between states. The most comprehensive of such 
estimates are those of the University of Pennsylvania Studies of Population 
Redistribution and Economic Growth.¹⁶ For intercensal periods, 1870-1880 to 
1940-1950, net migration was estimated by color or race, nativity of whites, 
sex, and five-year age groups for each of the 48 states and the District of Co- 
lumbia. Also noteworthy are the Department of Agriculture estimates of net 
migration for rural farm areas by counties for the United States in 1930-1940 
and 1940-1950.¹⁷ 
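
The residual arithmetic described above can be sketched in a few lines of Python. This is only an illustration: the function names and the population figures are invented for the example and do not come from any of the series discussed.

```python
def net_migration_residual(pop_start, pop_end, births, deaths):
    """Net migration as total increase minus natural increase."""
    total_increase = pop_end - pop_start
    natural_increase = births - deaths
    return total_increase - natural_increase


def net_migration_existing_cohort(pop_start, pop_end, deaths):
    """For persons born before the period began there are no births to
    subtract, so net migration is total increase plus deaths."""
    return (pop_end - pop_start) + deaths


# Invented figures for one area over one intercensal period.
print(net_migration_residual(1_000_000, 1_150_000, births=220_000, deaths=110_000))
print(net_migration_existing_cohort(50_000, 46_000, deaths=5_500))
```

A positive result indicates net in-migration and a negative result net out-migration.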

The techniques have been discussed in a number of publications and need 
only be briefly summarized here. Of the several methods of making residual 
estimates only one, the vital statistics method in which deaths are added and 
births subtracted from the total increase, is potentially exact. For most states, 


14 Margaret Jarman Hagood and Gladys K. Bowles in Major Statistical Series of the U. S. Department of Agri- 
culture: How They Are Constructed and Used, Volume 7, Farm Population, Employment, and Levels of Living 
(Washington: Agricultural Marketing Service, Agricultural Handbook No. 118), p. 6. 

15 Idem. 

16 For a complete list of these, see Everett S. Lee and others, op. cit. 

17 Eleanor H. Bernert, Volume and Composition of Net Migration from the Rural-Farm Population, 1930-40, 
for the United States, Major Geographic Divisions and States, (Bureau of Agricultural Economics, January, 1944) and 
Gladys K. Bowles, Farm Population: Net Migration from the Rural-Farm Population, 1940-50, (Agricultural Mar- 
keting Service, 1956). 
however, adequate registration of births and deaths is either recent or has not 
yet been achieved. The death registration area was not formed until 1900 and 
the birth registration area until 1915. Each originally consisted of ten states 
and the District of Columbia, and the forty-eighth state, Texas, was not ad- 
mitted to either until 1933. It is probably true that the registration of deaths is 
nearly complete in all states, but the registration of births is seriously defective 
in a number of states. 

When age specific estimates of net migration are required, major difficulties 
arise in the vital statistics method in the allocation of deaths, which are usually 
tabulated for five-year age groups and calendar years, to cohorts which age dur- 
ing the period over which estimates are made. For intercensal estimates, for 
example, it is necessary for the census years to estimate the number of deaths 
occurring before and after the census, and for each year deaths by single year 
of age must be estimated from the five-year tabulations. This is usually done by 
assuming an equal distribution of deaths throughout the year and an equal 
number of deaths at each single year of age in the five-year age group. Not only 
are these assumptions obviously untrue but the arithmetic involved is con- 
siderable. 

An alternative method, but one which can be used only for the cohorts born 
before the migration period began, is to estimate the effect of mortality by ap- 
plying survival ratios, the complement of mortality ratios, to the number in 
each cohort at the beginning of the period. By subtracting the estimated sur- 
vivors from the number enumerated at the end of the period an estimate of net 
migration is obtained. Regardless of the accuracy of the survival ratios and 
the population counts, however, there is an inevitable margin of error in such 
estimates. 

The error in the net migration estimate is, of course, equal to the error in the 
imputed deaths. The population at the beginning of a period is made up of two 
segments, one which out-migrates during the period and one which does not. 
When survival ratios are applied to the whole population the deaths which are 
computed are those which should have occurred not only to nonmigrants, but 
also those which should have occurred to out-migrants after leaving the area. 
At the same time no allowance is made for deaths of in-migrants during the 
period. Net migration is therefore overstated by deaths to out-migrants in 
other areas and understated by deaths to in-migrants. In states where there has 
been heavy net in-migration or heavy net out-migration the error may be con- 
siderable. 

The method described above, the forward survival method, is one of several 
which may be used. For example, one may work backward from the population 
at the end of the period by applying “reverse survival ratios,” or the forward 
and reverse survival methods may be combined and “average survival ratios” 
used. Each of these is advantageous under certain circumstances but, regard- 
less of which is used, the failure to exactly account for deaths leads to errors in 
the estimates of net migration. 
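
The three survival variants can be sketched as follows. The text names the methods without giving formulas, so the expressions below are assumptions: they follow one common convention in which the reverse estimate divides the end population by the survival ratio and the combined estimate is the simple mean of the forward and reverse results. All figures are invented.

```python
def forward_survival(pop_start, pop_end, survival_ratio):
    """End count minus the expected survivors of the start population."""
    return pop_end - survival_ratio * pop_start


def reverse_survival(pop_start, pop_end, survival_ratio):
    """Start population implied by the end count, minus the start count."""
    return pop_end / survival_ratio - pop_start


def average_survival(pop_start, pop_end, survival_ratio):
    """Mean of the forward and reverse estimates (assumed convention)."""
    forward = forward_survival(pop_start, pop_end, survival_ratio)
    reverse = reverse_survival(pop_start, pop_end, survival_ratio)
    return 0.5 * (forward + reverse)


# Invented cohort: 10,000 at the first census, 9,500 at the second,
# ten-year survival ratio 0.9.
for method in (forward_survival, reverse_survival, average_survival):
    print(method.__name__, round(method(10_000, 9_500, 0.9), 1))
```

The forward figure counts migrants as of the end of the period and the reverse figure as of the beginning, which is why the two differ whenever the survival ratio is below unity.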

The survival ratios which have generally been used are of two sorts, life 
table and census survival ratios. The former are computed from the $L_x$ column 
(“stationary population”) of appropriate life tables. Only for recent periods, 
however, are life tables available for the United States or for an appreciable 
number of states. The first life tables for the registration area, 1900-1902 and 
1909-1911 covered only ten states and the District of Columbia, and the 
“national” life tables of 1919-1921 were based upon the mortality experience 
of 34 states and the District of Columbia. The 1939-1941 tables were the first 
which were available for all 48 states and the first in which deaths were allo- 
cated to place of residence. 

Even where the desired life tables are available, the use of life table survival 
ratios is often unsatisfactory unless the populations to which they are applied 
are laboriously smoothed or adjusted. Life tables are based upon population 
and mortality data which have been elaborately adjusted, while the popula- 
tions given in censuses and to which they are applied are distorted by under- 
enumeration or overenumeration and misreporting, particularly of age. To 
some extent census survival ratios incorporate adjustments for the inaccuracies 
of census data and for that and other reasons have been used in the University 
of Pennsylvania and Department of Agriculture estimates mentioned above. 

Census survival ratios are simply the ratios of the numbers in given cohorts 
at one census to the corresponding numbers at the previous census. The as- 
sumption upon which this calculation is made is that the populations are 
“closed,” that is unaffected by migration. For the United States as a whole this 
is nearly true for native whites and for Negroes or nonwhites. Such ratios re- 
flect not only mortality but also the degree to which the actual populations are 
enumerated, age group by age group. 

As computed for the United States and applied to states or counties, two 
additional assumptions are involved: (1) that the proportion which the enu- 
merated populations in each age-sex group bears to the actual population is the 
same at each census for each state as for the nation; and (2) that specific mor- 
tality rates are the same for each state as for the nation. But, were the United 
States populations completely closed and these assumptions met, there would 
still be the type of error described above as inherent in all types of survival 
methods of estimating net migration. 
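
As a rough illustration of the census survival procedure, the sketch below computes a national ratio for one cohort and applies it to a state. The function names and all counts are hypothetical, and the calculation rests on the two assumptions just listed (equal enumeration completeness and equal mortality in state and nation).

```python
def census_survival_ratio(national_earlier, national_later):
    """Ratio of a cohort's national count at one census to its count
    at the previous census (the nation is treated as closed)."""
    return national_later / national_earlier


def state_net_migration(state_earlier, state_later, ratio):
    """Forward-survival residual for a state cohort using the national ratio."""
    return state_later - ratio * state_earlier


# Invented counts for one age-sex cohort.
ratio = census_survival_ratio(2_000_000, 1_900_000)
print(ratio)
print(state_net_migration(40_000, 41_000, ratio))
```

Because the ratio is formed from enumerated counts, it absorbs census coverage errors along with mortality, which is the property the text describes.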

Obviously the range of error in such estimates of net migration is large. On 
the basis of such assumptions as a normal distribution of the proportions of 
state populations enumerated, a value of the true survival ratio between .85 
and .95, and a ratio between the true and census survival ratio of between .9 
and 1.1, Price has estimated that about a third of the errors in net migration 
estimates would be as great as 2.5 per cent of the state populations at the end 
of the census intervals, and as great as 25 per cent of the true net intercensal 
migration.¹⁸ 

Whether or not the actual errors of such magnitude are of this frequency, it 
is certain that residual estimates of net migration from American census data 
will be in error to a considerable degree. Where heavy in-migration or out- 
migration is indicated, however, the exact amount may be questioned but there 
will seldom be reason to question whether net in-migration or net out-migration 
occurred or that the amount was considerable. On the other hand, estimates of 


18 Daniel O. Price, “Examination of Two Sources of Error in the Estimation of Net Internal Migration,” 
Journal of the American Statistical Association 50: 689-700, September, 1955. 
small gains or losses should be regarded more as evidence that the net change 
through migration was not important rather than as certain indicators of 
either magnitude or direction. 
BIVARIATE EXPONENTIAL DISTRIBUTIONS 


E. J. GUMBEL 
Columbia University* 


A bivariate distribution is not determined by the knowledge of the 
margins. Two bivariate distributions with exponential margins are 
analyzed and another is briefly mentioned. 

In the first distribution (2.1) the conditional expectation of one vari- 
able decreases to zero with increasing values of the other one. The 
coefficient of correlation is never positive and lies in the interval 
$-.40 \le \rho \le 0$, and the correlation ratio varies from $-.48$ to zero. 

In the second distribution (3.4) the conditional expectation of one 
variable increases or decreases with increasing values of the other 
variable depending on the sign of the correlation. The coefficient of 
correlation lies in the interval $-.25 \le \rho \le .25$, and the correlation ratio 
is proportional to the coefficient. 


The exponential distribution, which is analytically very simple, plays a 
prominent role in physics since it governs radioactive decay [7]. It holds 
for distances in time, especially between the happening of rare events [1]. In 
recent years it has served as a first approach to a model for life testing. Finally 
it can be used as the starting point for the theory of extreme values [5]. It may 
therefore be of interest to study bivariate distributions where the marginal dis- 
tributions are exponential. 

Most thinking on bivariate distributions is centered about the normal case, 
which has been studied intensively since the times of Bravais and Karl Pearson. 
It was believed that its well-known properties may serve as a general model. 
The curves of equal probability density are concentric ellipses. The regression 
curves are straight lines which intersect at the expectations. With increasing 
values of one variable the conditional expectation of the other one increases 
(or decreases) without limit. Finally, the coefficient of correlation varies from 
$-1$ to $+1$. However, in a previous publication [4] bivariate distributions with 
marginal normal distributions were constructed where these properties do not 
hold. 

The marginal distributions do not determine the corresponding bivariate dis- 
tribution. On the contrary, M. Fréchet [2] has proven that for given marginal 
distributions there exist infinitely many bivariate distributions with these 
margins. Fréchet’s result concerns the existence of these bivariate distributions 
and does not involve their construction; thus it is of interest to examine specific 
analytical forms of bivariate distributions. 

In the following, two bivariate distributions with exponential margins will 
be studied. Again, none of the properties of the normal distributions are valid: 
the curves of equal probability density are not ellipses, the regression curves 
are not straight lines and do not intersect at the common mean. With increasing 
values of one variable the conditional expectation of the other remains within 
finite limits. Finally, the coefficient of correlation varies in a narrower domain. 


* Work done in part under a grant from the Higgins Foundation. 
1. BIVARIATE DISTRIBUTIONS 


Let $F_1(x)$ and $F_2(y)$, $f_1(x)$ and $f_2(y)$ be the probability and density functions 
of continuous random variables $X$ and $Y$. Then a bivariate probability function 
$F(x, y)$ with these marginal distributions is monotonically increasing from zero 
to unity and is subject to the following conditions: 

a) 
\[ F(-\infty, y) = F(x, -\infty) = 0; \quad F(x, \infty) = F_1(x); \quad F(\infty, y) = F_2(y); \quad F(\infty, \infty) = 1. \tag{1.1} \]

b) The probability content of every rectangle is nonnegative, that is, for 
every $x_1 < x_2$, $y_1 < y_2$, 
\[ \mathrm{Prob}\{x_1 < X < x_2,\; y_1 < Y < y_2\} = F(x_2, y_2) - F(x_1, y_2) - F(x_2, y_1) + F(x_1, y_1) \ge 0. \tag{1.2} \]

If the second cross partial derivative $\partial^2 F/\partial x\,\partial y$ exists everywhere, the 
bivariate distribution has a density $f(x, y)$ equal to this derivative and the 
condition (1.2) is then equivalent to 

\[ f(x, y) \ge 0. \tag{1.3} \]

The variables are independent if and only if 

\[ F(x, y) = F_1(x)F_2(y). \tag{1.4} \]

More generally, the marginal density functions $f_1(x)$ and $f_2(y)$ are related to the 
bivariate density function $f(x, y)$ by 

\[ \int_{-\infty}^{\infty} f(x, y)\,dy = f_1(x); \qquad \int_{-\infty}^{\infty} f(x, y)\,dx = f_2(y). \tag{1.5} \]

The study of the conditional densities 

\[ f(x \mid y) = f(x, y)/f_2(y); \qquad f(y \mid x) = f(x, y)/f_1(x) \tag{1.6} \]

leads to the conditional expectations $E(x \mid y)$ and $E(y \mid x)$ and to the expectation 
of the cross product 

\[ E(xy) = \int_{-\infty}^{\infty} y\,E(x \mid y)\,f_2(y)\,dy. \tag{1.7} \]

These values lead to the classical coefficient of correlation 

\[ \rho = \frac{E(xy) - E(x)E(y)}{\sigma(x)\,\sigma(y)}, \tag{1.8} \]

to the squared correlation ratio 

\[ \eta^2(x \mid y) = \frac{1}{\sigma^2(x)} \int_{-\infty}^{\infty} [E(x \mid y) - E(x)]^2 f_2(y)\,dy, \tag{1.9} \]

and the corresponding expression $\eta^2(y \mid x)$. The correlation ratios measure the 
departure from the regression lines. These well known methods will now be 
applied to two new distributions and the results will be compared to the cor- 
responding properties of the usual bivariate normal distribution. 

2. A BIVARIATE DISTRIBUTION WITH EXPONENTIAL MARGINS 


Consider the bivariate function 

\[ F(x, y) = 1 - e^{-x} - e^{-y} + e^{-x-y-\delta xy}; \qquad x \ge 0,\; y \ge 0. \tag{2.1} \]

The marginal probabilities are exponential. The case $\delta = 0$ leads from (1.4) to 
independence. The boundary values of (2.1) are 

\[ F(x, 0) = F(0, y) = F(0, 0) = 0; \qquad F(\infty, \infty) = 1. \]

We are going to prove now that the parameter $\delta$ must satisfy the inequalities 

\[ 0 \le \delta \le 1. \tag{2.2} \]

From (1.2), the density function $f(x, y)$ is 

\[ f(x, y) = e^{-x-y-\delta xy}\,[(1 + \delta x)(1 + \delta y) - \delta] \tag{2.3} \]

with 

\[ f(\infty, y) = f(x, \infty) = 0; \qquad f(0, 0) = 1 - \delta. \]

From the last equation and from the nonnegativity of a density function, it 
follows that $\delta \le 1$. 

From the inequality, which is true for all bivariate distributions, 

\[ F(x, y) \le F_1(x) \]

and from (2.1) it follows after a short simplification that 

\[ -x(1 + \delta y) \le 0 \tag{2.4} \]

for all $x$ and $y$. Since these are always nonnegative it follows that $\delta$ cannot be 
negative. Therefore $\delta \ge 0$. 

With the restriction (2.2) the conditions (1.1) and (1.3) are fulfilled. The 
relation (1.5) is immediately verified. Therefore (2.1) constitutes a bivariate 
distribution with exponential margins. 
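
The conditions can also be checked numerically. The sketch below assumes the probability function $F(x, y) = 1 - e^{-x} - e^{-y} + e^{-x-y-\delta xy}$, which is the form consistent with the density (2.3); the function names are illustrative only.

```python
import math


def cdf(x, y, delta):
    """First bivariate exponential probability function (assumed form)."""
    return 1.0 - math.exp(-x) - math.exp(-y) + math.exp(-x - y - delta * x * y)


def pdf(x, y, delta):
    """Density (2.3): the cross partial derivative of the function above."""
    return math.exp(-x - y - delta * x * y) * ((1 + delta * x) * (1 + delta * y) - delta)


def rectangle_probability(x1, x2, y1, y2, delta):
    """Probability content of a rectangle, as in condition (1.2)."""
    return cdf(x2, y2, delta) - cdf(x1, y2, delta) - cdf(x2, y1, delta) + cdf(x1, y1, delta)


grid = [0.25 * i for i in range(41)]
for delta in (0.0, 0.5, 1.0):
    smallest = min(pdf(x, y, delta) for x in grid for y in grid)
    print(delta, smallest >= 0.0)

# Outside the admissible range the density is negative at the origin.
print(pdf(0.0, 0.0, 1.5))
```

For a very large second argument the function reduces to $1 - e^{-x}$, so the margin is exponential whatever the value of $\delta$.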

Evidently the curves of equal probability density are not ellipses but 
transcendental functions. Since the probability function (2.1) depends in the same 
way on $x$ and $y$ it is sufficient to analyze one variable, say $x$. 

The conditional density function $f(x \mid y)$ is, from (1.6) and (2.3), 

\[ f(x \mid y) = e^{-x(1 + \delta y)}\,[(1 + \delta x)(1 + \delta y) - \delta]. \tag{2.5} \]

The boundary density function $f(x \mid 0)$, which is one of the conditional density 
functions and should not be confused with the marginal density, is 

\[ f(x \mid 0) = e^{-x}(1 - \delta + \delta x). \]

It has a mode at $x = 2 - 1/\delta$, provided that $\delta > 1/2$. Otherwise the density 
decreases with $x$. 

The conditional expectation is 

\[ E(x \mid y) = \frac{1 + \delta + \delta y}{(1 + \delta y)^2}. \tag{2.6} \]
This regression curve is traced in Graph 1 for the two limiting cases $\delta = 0$ and 
$\delta = 1$. The conditional expectation (2.6) diminishes for increasing values of $y$ 
from $1 + \delta$, valid for $y = 0$, to zero, while the variable $y$ is unlimited to the right. 
The two regression curves do not intersect at the common mean $E(x) = E(y) = 1$, 
except in the trivial case $\delta = 0$. 

The conditional second moment is, from (2.5), 

\[ E(x^2 \mid y) = \frac{2}{(1 + \delta y)^2} + \frac{4\delta}{(1 + \delta y)^3}. \]

Subtraction of $E^2(x \mid y)$ leads from (2.6) to the conditional variance $\sigma^2(x \mid y)$ of 
$x$ as a function of $y$, 

\[ \sigma^2(x \mid y) = \frac{1}{(1 + \delta y)^2} + \frac{2\delta}{(1 + \delta y)^3} - \frac{\delta^2}{(1 + \delta y)^4}, \tag{2.7} \]

which, of course, is equal to unity for $\delta = 0$, i.e., the case of independence. In 
the opposite case $\delta = 1$, the expression becomes 

\[ \sigma^2(x \mid y) = \frac{2 + 4y + y^2}{(1 + y)^4}. \tag{2.8} \]

The conditional standard deviation of $x$ as a function of $y$, 

\[ \sigma(x \mid y) = [(1 + y)^2 + 1 + 2y]^{1/2}\,(1 + y)^{-2}, \tag{2.9} \]

decreases for $\delta = 1$ with increasing values of $y$ (see Graph 1). The squared 
conditional coefficient of variation obtained from (2.6) and (2.7), 

\[ \frac{\sigma^2(x \mid y)}{E^2(x \mid y)} = \frac{(1 + \delta y)^2 + 2\delta(1 + \delta y) - \delta^2}{(1 + \delta y)^2 + 2\delta(1 + \delta y) + \delta^2}, \]

converges, for increasing values of $y$, to unity, which is also its value for the 
marginal distributions. In this sense 

\[ \sigma(x \mid y) \sim E(x \mid y) \tag{2.10} \]
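independent of the value of $\delta$. The cross product is, from (2.6) and (1.7), 

\[ E(xy) = \int_0^\infty \frac{y\,(1 + \delta + \delta y)}{(1 + \delta y)^2}\,e^{-y}\,dy = \int_0^\infty \frac{y\,e^{-y}}{1 + \delta y}\,dy + \delta \int_0^\infty \frac{y\,e^{-y}}{(1 + \delta y)^2}\,dy. \]

The integrals may be written, after the appropriate additions and subtractions, 

\[ E(xy) = \frac{1}{\delta} - \frac{1}{\delta}\int_0^\infty \frac{e^{-y}}{1 + \delta y}\,dy + \int_0^\infty \frac{e^{-y}}{1 + \delta y}\,dy - \int_0^\infty \frac{e^{-y}}{(1 + \delta y)^2}\,dy. \]

Partial integration of the last integral leads, after the transformation 

\[ 1/\delta + y = z, \]

to 

\[ E(xy) = -\frac{e^{1/\delta}}{\delta}\,\mathrm{Ei}(-1/\delta), \tag{2.11} \]

where Ei stands for the integral logarithm. Since the means and standard 
deviations of the marginal distributions are equal to unity, the coefficient of 
correlation $\rho$ is a function of the parameter $\delta$, namely 

\[ \rho = -\frac{e^{1/\delta}}{\delta}\,\mathrm{Ei}(-\delta^{-1}) - 1. \tag{2.12} \]

From (2.11) the coefficient of correlation is zero for $\delta = 0$. It decreases for 
increasing values of $\delta$ to $\rho = -.40365$, valid for $\delta = 1$, as shown in the upper part of 
Graph 1. The correlation is never positive. 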
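
The limiting value of the correlation can be reproduced without tables of the integral logarithm. The sketch below uses the fact that the cross product reduces to $E(xy) = \int_0^\infty e^{-y}/(1 + \delta y)\,dy$, which is the integral that the expression in (2.11) evaluates, and approximates it by a composite Simpson rule. The quadrature, the truncation point and the function names are choices made for this illustration, not part of the paper's method.

```python
import math


def simpson(g, upper=60.0, steps=60000):
    """Composite Simpson rule for the integral of g over [0, upper]; steps is even."""
    h = upper / steps
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


def rho(delta):
    """Correlation coefficient of the first distribution: E(xy) - 1."""
    cross_product = simpson(lambda y: math.exp(-y) / (1.0 + delta * y))
    return cross_product - 1.0


for delta in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(delta, round(rho(delta), 5))
```

The printed values fall steadily from zero as $\delta$ grows, in line with the statement that the correlation is never positive.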

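Since the regression curves (2.6) are not straight lines, the correlation ratio 
does not measure the departure from the regression lines. In the present case 
its value is 

\[ \eta^2(x \mid y) = \int_0^\infty [E(x \mid y) - 1]^2\,e^{-y}\,dy. \]

From (2.6), 

\[ \eta^2(x \mid y) = 1 - 2J(1) + (1 - 2\delta)J(2) + 2\delta J(3) + \delta^2 J(4), \tag{2.13} \]

where 

\[ J(k) = \int_0^\infty \frac{e^{-y}}{(1 + \delta y)^k}\,dy. \tag{2.14} \]

For decreasing values of $k$ the integrals are linked by the recurrence formula 

\[ J(k) = \frac{1}{(k - 1)\delta} - \frac{J(k - 1)}{(k - 1)\delta}. \]

The introduction of the values for $k = 2, 3, 4$ into (2.13) yields 

\[ \eta^2(x \mid y) = \frac{\delta}{3} - \frac{1}{6} + \frac{1}{6\delta} - \frac{J(1)}{6\delta}. \]

On the other hand, the combination of (2.11), (2.12) and (2.14) leads to 

\[ \rho = J(1) - 1, \]

whence 

\[ \eta^2(x \mid y) = \frac{\delta}{3} - \frac{1}{6} - \frac{\rho}{6\delta}. \tag{2.15} \]

For $\delta = 1$ we obtain 

\[ \eta^2(x \mid y) = .23394. \]

For $\delta = 0$, the last term in (2.15) becomes indeterminate. To obtain its value 
we expand 

\[ \rho = -1 + \int_0^\infty \frac{e^{-y}}{1 + \delta y}\,dy \]

in increasing powers of $\delta y$. This yields 

\[ \rho = \sum_{v=1}^{\infty} (-1)^v\,v!\,\delta^v, \]

whence, for $\delta = 0$, 

\[ \frac{\rho}{6\delta} = -\frac{1}{6}. \]

From (2.15) it follows that $\eta^2(x \mid y) = 0$ for $\delta = 0$, as it should be. For increasing 
values of $\delta$ the correlation ratio $-\sqrt{\eta^2(x \mid y)}$ decreases from zero to $-.4837$, as 
shown in the upper part of Graph 1. 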
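
As a check on the reduction, the squared correlation ratio can be computed in two ways: by integrating $[E(x \mid y) - 1]^2 e^{-y}$ directly with the regression curve (2.6), and from the closed form $\delta/3 - 1/6 - \rho/(6\delta)$ of (2.15). The sketch below is an illustration under the same numerical assumptions as before (composite Simpson rule, upper limit 60); the names are invented for the example, and the closed form requires $\delta > 0$.

```python
import math


def simpson(g, upper=60.0, steps=60000):
    """Composite Simpson rule for the integral of g over [0, upper]; steps is even."""
    h = upper / steps
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


def eta_squared_direct(delta):
    """Integral of (E(x|y) - 1)^2 exp(-y), with E(x|y) taken from (2.6)."""
    def integrand(y):
        conditional_mean = (1.0 + delta + delta * y) / (1.0 + delta * y) ** 2
        return (conditional_mean - 1.0) ** 2 * math.exp(-y)
    return simpson(integrand)


def eta_squared_closed(delta):
    """Closed form (2.15), valid for delta > 0."""
    rho = simpson(lambda y: math.exp(-y) / (1.0 + delta * y)) - 1.0
    return delta / 3.0 - 1.0 / 6.0 - rho / (6.0 * delta)


for delta in (0.25, 0.5, 1.0):
    print(delta, round(eta_squared_direct(delta), 5), round(eta_squared_closed(delta), 5))
print(-math.sqrt(eta_squared_closed(1.0)))
```
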
3. A SECOND BIVARIATE DISTRIBUTION WITH EXPONENTIAL MARGINS 


In previous articles [4, 6], it was shown that given two probability functions 
$F_1(x)$ and $F_2(y)$ a bivariate function $F(x, y)$ can be constructed by means of 
the equation 

\[ F(x, y) = F_1(x)F_2(y)\,[1 + \alpha\{1 - F_1(x)\}\{1 - F_2(y)\}] \tag{3.1} \]

where 

\[ -1 \le \alpha \le 1. \tag{3.2} \]

The bivariate density function is given by 

\[ f(x, y) = f_1(x)f_2(y)\,[1 + \alpha(2F_1(x) - 1)(2F_2(y) - 1)]. \tag{3.3} \]

The fact that conditions (1.1) and (1.3) hold was shown in the publications 
mentioned. 

Let $F_1(x)$ and $F_2(y)$ be exponential functions. Then the bivariate probability 
and density functions become, for $x \ge 0$, $y \ge 0$, 

\[ F(x, y) = (1 - e^{-x})(1 - e^{-y})\,[1 + \alpha e^{-x-y}], \qquad f(x, y) = e^{-x-y}\,[1 + \alpha(2e^{-x} - 1)(2e^{-y} - 1)]. \tag{3.4} \]

Since these functions depend in the same way on $x$ and $y$ it is sufficient to study 
one variable, say $x$. The marginal density function obtained from (1.5) is, of 
course, the exponential function itself. 

The conditional density function $f(x \mid y)$ is, from (1.6) and (3.4), 

\[ f(x \mid y) = e^{-x}\,[1 + \alpha(2e^{-x} - 1)(2e^{-y} - 1)]. \tag{3.5} \]

It follows that the boundary density function 

\[ f(x \mid 0) = e^{-x}(1 - \alpha) + 2\alpha e^{-2x} \]

diminishes with $x$ if $\alpha > 0$. For $\alpha < 0$ it has a mode at 

\[ x = \log\frac{-4\alpha}{1 - \alpha}, \]

which is positive provided that $\alpha < -1/3$. 

The regression curve of $x$ on $y$ is obtained from (3.5) as 

\[ E(x \mid y) = 1 + \frac{\alpha}{2} - \alpha e^{-y}. \tag{3.6} \]

The conditional expectation increases for $\alpha > 0$ (decreases for $\alpha < 0$) with in- 
creasing values of $y$ from $1 - \alpha/2$ to $1 + \alpha/2$. Thus it remains finite. The regres- 
sion curves are not linear but exponential functions traced in Graph 2. 
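
The regression curve can be verified by integrating the conditional density numerically. The sketch below takes the conditional density to be $e^{-x}[1 + \alpha(2e^{-x} - 1)(2e^{-y} - 1)]$ and the regression curve to be $1 + \alpha/2 - \alpha e^{-y}$; the quadrature rule and the function names are assumptions made for the illustration.

```python
import math


def simpson(g, upper=60.0, steps=60000):
    """Composite Simpson rule for the integral of g over [0, upper]; steps is even."""
    h = upper / steps
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


def conditional_density(x, y, alpha):
    """Conditional density of x given y for the second distribution."""
    return math.exp(-x) * (1.0 + alpha * (2.0 * math.exp(-x) - 1.0) * (2.0 * math.exp(-y) - 1.0))


def regression(y, alpha):
    """Closed-form conditional expectation of x given y."""
    return 1.0 + alpha / 2.0 - alpha * math.exp(-y)


def conditional_mean_numeric(y, alpha):
    """Conditional expectation obtained by integrating x times the density."""
    return simpson(lambda x: x * conditional_density(x, y, alpha))


for alpha in (-1.0, 0.5, 1.0):
    for y in (0.0, math.log(2.0), 3.0):
        print(alpha, round(y, 3), round(conditional_mean_numeric(y, alpha), 6), round(regression(y, alpha), 6))
```

At the median $y = \log 2$ every curve passes through unity, and the limits $1 - \alpha/2$ and $1 + \alpha/2$ appear at $y = 0$ and for large $y$.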

For two values of $\alpha$ which differ only in sign the conditional expectations 
are related by 

\[ \tfrac{1}{2}\,[E_{-\alpha}(x \mid y) + E_{\alpha}(x \mid y)] = 1. \tag{3.7} \]

This constitutes a symmetry of the conditional expectation about unity. 
The conditional expectation $E(x^2 \mid y)$ is easily obtained from (3.5) as 

\[ E(x^2 \mid y) = 2 + \tfrac{3}{2}\alpha - 3\alpha e^{-y}. \]

GRAPH 2. Regression curves for bivariate exponential distributions (second type) 

In analogy to (3.7) we have 

\[ \tfrac{1}{2}\,[E_{-\alpha}(x^2 \mid y) + E_{\alpha}(x^2 \mid y)] = 2. \]

The conditional variance $\sigma^2(x \mid y)$ of $x$ as a function of $y$ is, from (3.6), 

\[ \sigma^2(x \mid y) = 1 + \frac{\alpha}{2}(1 - 2e^{-y}) - \frac{\alpha^2}{4}(1 - 2e^{-y})^2. \tag{3.8} \]

The conditional expectations, the squared expectations, and the variances are 
equal to the unconditional values 1, 2 and 1 at the medians $y = \log 2$. 

For positive (negative) values of $\alpha$ the variance increases (decreases) with 
increasing values of $y$ from 

\[ 1 - \frac{\alpha}{2} - \frac{\alpha^2}{4} \quad \text{to} \quad 1 + \frac{\alpha}{2} - \frac{\alpha^2}{4}. \]

In particular, for $\alpha = 1$, it increases from 1/4 to 5/4 and decreases for $\alpha = -1$ 
from 5/4 to 1/4. Consequently the standard deviations vary between .5 and 
1.118 as shown in Graph 3. The relation (2.10) holds for this system only in 
the case $\alpha = -1$. Then the expression 

\[ \frac{\sigma^2(x \mid y)}{E^2(x \mid y)} = \frac{1 + 8e^{-y} - 4e^{-2y}}{(1 + 2e^{-y})^2} \tag{3.9} \]

converges, with increasing $y$, towards unity, while for $\alpha = 1$, the expression con- 
verges to 5/9. 

GRAPH 3 


The expectation of the cross product becomes, from (1.7) and (3.6), 

\[ E(xy) = 1 + \frac{\alpha}{4}. \]

Therefore the coefficient of correlation is 

\[ \rho = \frac{\alpha}{4}. \tag{3.10} \]

From (3.2) the correlation varies only within the narrow domain 

\[ -.25 \le \rho \le .25. \tag{3.11} \]

In contrast to the previous case it can also be positive. The correlation ratio 
$\eta^2(x \mid y)$ becomes, from (3.6), 

\[ \eta^2(x \mid y) = \frac{\alpha^2}{4} \int_0^\infty (1 - 2e^{-y})^2\,e^{-y}\,dy = \frac{\alpha^2}{12}. \tag{3.12} \]

Thus the corresponding ratio 

\[ \eta(x \mid y) = 2\rho/\sqrt{3} \tag{3.13} \]

is a multiple of the coefficient of correlation, and varies in the interval $\pm .28867$. 

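
Both summary measures of the second distribution follow from one-dimensional integrals over the regression curve $E(x \mid y) = 1 + \alpha/2 - \alpha e^{-y}$. The sketch below recomputes $\rho$ and the squared correlation ratio numerically so that they can be compared with $\alpha/4$ and $\alpha^2/12$; the quadrature and all names are illustrative assumptions.

```python
import math


def simpson(g, upper=60.0, steps=60000):
    """Composite Simpson rule for the integral of g over [0, upper]; steps is even."""
    h = upper / steps
    total = g(0.0) + g(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * g(i * h)
    return total * h / 3.0


def regression(y, alpha):
    """Conditional expectation of x given y for the second distribution."""
    return 1.0 + alpha / 2.0 - alpha * math.exp(-y)


def correlation(alpha):
    """E(xy) - E(x)E(y); both margins have unit mean and unit variance."""
    return simpson(lambda y: y * regression(y, alpha) * math.exp(-y)) - 1.0


def eta_squared(alpha):
    """Variance of the regression curve over the exponential margin of y."""
    return simpson(lambda y: (regression(y, alpha) - 1.0) ** 2 * math.exp(-y))


for alpha in (-1.0, -0.5, 0.5, 1.0):
    rho = correlation(alpha)
    print(alpha, round(rho, 5), round(math.sqrt(eta_squared(alpha)), 5), round(2.0 * abs(rho) / math.sqrt(3.0), 5))
```

The last two printed columns agree, which is the proportionality stated in (3.13).
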
A third bivariate distribution with exponential margins is given by 

\[ F(x, y) = 1 - e^{-x} - e^{-y} + P(x, y) \tag{3.14} \]

where 

\[ P(x, y) = \exp[-(x^m + y^m)^{1/m}]. \tag{3.15} \]

The case $m = 1$ leads to independence. The density function 

\[ f(x, y) = P(x, y)\,(x^m + y^m)^{1/m - 2}\,x^{m-1}y^{m-1}\,[(x^m + y^m)^{1/m} + m - 1] \tag{3.16} \]

is nonnegative only if $m \ge 1$. 


CONCLUSION 


The fact that many of the properties of the bivariate normal distribution do not 
hold here may serve as a warning against the indiscriminate use of normal 
correlation and regression analysis; prior investigation of the nature of the 
bivariate distributions is necessary. 

If we know that the marginal distributions are exponential, then the dif- 
ferent behavior of the conditional distributions (2.5) or (3.5) and in particular, 
of the boundary distributions, of the regression curves (2.6) or (3.6), of the con- 
ditional standard deviations (2.7) or (3.8) and the different domains of the co- 
efficient of correlation and the correlation ratio may be used as criteria for the 
acceptance of one of the two systems for a given set of observations. However, 
it has to be realized that other bivariate distributions with exponential margins 
exist which are outside of the systems considered here. 
ON THE EXACT VARIANCE OF PRODUCTS* 


LEO A. GOODMAN 
University of Chicago 


A simple exact formula for the variance of the product of two random 
variables, say, x and y, is given as a function of the means and central 
product-moments of x and y. The usual approximate variance formula 
for xy is compared with this exact formula; e.g., we note, in the special 
case where x and y are independent, that the “variance” computed by 
the approximate formula is less than the exact variance, and that the 
accuracy of the approximation depends on the sum of the reciprocals 
of the squared coefficients of variation of x and y. The case where x 
and y need not be independent is also studied, and exact variance 
formulas are presented for several different “product estimates.” (The 
usefulness of exact formulas becomes apparent when the variances of 
these estimates are compared.) When x and y are independent, simple 
unbiased estimates of these exact variances are suggested; in the more 
general case, consistent estimates are presented. 


1. INTRODUCTION AND SUMMARY 


The usual formula, which has appeared in the statistical literature, for the 
variance of the product of two independent random variables is an approxi- 
mation (see, for example, Yates [3, p. 198]). In this literature, it has also been 
suggested that this approximate formula for the variance is satisfactory only if 
the coefficients of variation of the two random variables are both relatively 
small. In the present note, we shall present a simple exact formula for this 
variance, which does not depend on any assumptions concerning the magni- 
tudes of the coefficients of variation. The relative accuracy of the usual approxi- 
mate formula for this variance will be computed. It will be seen that the 
“variance” obtained using the approximate formula is less than the exact 
variance, and that the approximation may be satisfactory in some cases where 
one or both of the coefficients of variation are relatively small. A simple unbiased 
estimate of the exact variance will also be presented. The exact formula for the 
variance will then be used to compute the relative efficiency of two different 
kinds of estimates of a parameter that is in fact equal to the product of two 
other parameters. (The reader will note that the usual approximate variance 
formula could not be used to compute correctly the relative efficiency of these 
two estimates. In fact, if the approximate variance formula had been applied, 
rather than the exact formula, an erroneous conclusion regarding the relative 
efficiency of these estimates might have been obtained.) This exact formula will 
also be generalized to obtain an exact formula for the variance of the product 
of three (or more) independent random variables. Finally, the situation where 
the random variables need not be independent will be investigated, and exact 


* Research carried out at the Statistical Research Center, University of Chicago, under the sponsorship of the 
Statistics Branch, Office of Naval Research, and of the Social Science Research Committee, University of Chicago. 
Reproduction in whole or in part is permitted for any purpose of the United States Government. 

I am indebted to R. L. Ashenhurst, H. L. Jones and R. Summers for some helpful comments. Formula (18) was 
obtained independently by Professor Jones using a somewhat different method from that presented in the present 
note. 
variance formulas will be presented in this case along with consistent estimates 
of these variances. 

The method of obtaining the exact variance formulas, which are presented 
here, is quite simple. This method can be generalized to obtain exact formulas 
for any of the central moments of the product of two or more random variables. 
These formulas could also be derived from general formulas relating product- 
moments about the origin to central product-moments (see, for example, 
Kendall and Stuart [1, p. 82] and Tschuprow [2]), but an additional set of 
calculations would then be required in order to modify the formulas obtained 
for the product-moments about the origin so that relatively simple formulas 
could be obtained for the central moments of the product. The present paper 
presents a more direct method of proof. To the best of my knowledge, the simple 
exact formulas presented in this paper are new. (It seems surprising that they 
should not have appeared in print before this.) 

There are many situations where the variance of the product of two random 
variables is of interest (e.g., where an estimate is computed as a product of two 
other estimates), so that it will not be necessary to describe these situations in 
any detail in the present note. 


2. THE CASE WHERE THE RANDOM VARIABLES ARE INDEPENDENT 


Let $x$ and $y$ be two independent random variables. Let us denote the ex- 
pected value of $x$ by $E(x) = X$, the variance of $x$ by $V(x)$, and the square of the 
coefficient of variation of $x$ by $V(x)/X^2 = G(x)$. A similar notation will be used 
for the random variable $y$. (For the sake of simplicity, we shall assume here 
that $E(x) = X$ and $E(y) = Y$ differ from zero, although some of the results pre- 
sented here do not require this assumption.) Since 

\[ xy - XY = XY[(\delta x + 1)(\delta y + 1) - 1] = XY[\delta x + \delta y + \delta x\,\delta y], \tag{1} \]

where $\delta x = (x - X)/X$ and $\delta y = (y - Y)/Y$, we have that the variance $V(xy)$ of 
the product $xy$ is equal to 

\[ V(xy) = E\{(xy - XY)^2\} = (XY)^2[G(x) + G(y) + G(x)G(y)] = X^2V(y) + Y^2V(x) + V(x)V(y). \tag{2} \]

The usual approximate formula, which has appeared in the literature, is 

\[ \tilde{V}(xy) = X^2V(y) + Y^2V(x) = (XY)^2[G(y) + G(x)]. \tag{3} \]

Thus, the relative inaccuracy of the approximation $\tilde{V}(xy)$ is 

\[ R = [V(xy) - \tilde{V}(xy)]/V(xy) = G(x)G(y)/[G(x) + G(y) + G(x)G(y)] = 1/[A + 1], \tag{4} \]

where $A = G^{-1}(x) + G^{-1}(y)$. From (4) we see that, if either $G(x)$ or $G(y)$ is quite 
small, then $A$ will be relatively large, and the relative inaccuracy will be small. 
Thus, the approximate formula may be satisfactory even in some cases where 
only one of the two coefficients is small, contrary to what has usually been sug- 
gested in the literature. 
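
A small numerical illustration of formulas (2) to (4) may help. The Python sketch below uses hypothetical function names and invented moments; it evaluates the exact variance, the approximate variance and the relative inaccuracy $R = 1/(A + 1)$ for a case in which only one squared coefficient of variation is small.

```python
def exact_variance(mean_x, var_x, mean_y, var_y):
    """Exact variance of xy for independent x and y, formula (2)."""
    return mean_x ** 2 * var_y + mean_y ** 2 * var_x + var_x * var_y


def approximate_variance(mean_x, var_x, mean_y, var_y):
    """Usual approximation, formula (3): the product of variances is dropped."""
    return mean_x ** 2 * var_y + mean_y ** 2 * var_x


def relative_inaccuracy(mean_x, var_x, mean_y, var_y):
    """R = 1/(A + 1), with A the sum of reciprocal squared coefficients of variation."""
    g_x = var_x / mean_x ** 2
    g_y = var_y / mean_y ** 2
    return 1.0 / (1.0 / g_x + 1.0 / g_y + 1.0)


# Invented moments: G(x) = 0.04 is small, G(y) = 1 is not.
args = (10.0, 4.0, 5.0, 25.0)
print(exact_variance(*args), approximate_variance(*args), relative_inaccuracy(*args))
```

With these figures the approximation understates the variance by under four per cent even though the coefficient of variation of $y$ equals unity.
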
We shall now present an unbiased estimate of the variance $V(xy)$. Since 
$E(x^2) - X^2 = V(x)$, we have that 

\[ v(xy) = [x^2 - v(x)]v(y) + [y^2 - v(y)]v(x) + v(x)v(y) = x^2v(y) + y^2v(x) - v(x)v(y) \tag{5} \]

is an unbiased estimate of $V(xy)$, where $v(x)$ is an unbiased estimate of $V(x)$ 
and $v(y)$ is an unbiased estimate of $V(y)$. It is interesting to note that, while 
$V(x)V(y)$ is added to the usual approximate formula $\tilde{V}(xy) = X^2V(y) + Y^2V(x)$ 
to obtain the exact formula for $V(xy)$, the quantity $v(x)v(y)$ is subtracted from 
the estimate $\tilde{v}(xy) = x^2v(y) + y^2v(x)$ to obtain the unbiased estimate $v(xy)$ 
of $V(xy)$. 

Let us now consider the situation where a sample of $x$'s and an independent 
sample of $y$'s are obtained, and where the sample sizes are $n(x)$ and $n(y)$, re- 
spectively. Let $\bar{x}$ and $\bar{y}$ be the sample means of the $x$'s and $y$'s, respectively, 
and let $s^2(x)$ and $s^2(y)$ be the usual unbiased estimates of $V(x)$ and $V(y)$, re- 
spectively. Then $\bar{x}\bar{y}$ will be an unbiased estimate of $XY$ whose variance is 

\[ V(\bar{x}\bar{y}) = X^2V(\bar{y}) + Y^2V(\bar{x}) + V(\bar{x})V(\bar{y}) = \frac{X^2V(y)}{n(y)} + \frac{Y^2V(x)}{n(x)} + \frac{V(x)V(y)}{n(x)n(y)}. \tag{6} \]

An unbiased estimate of $V(\bar{x}\bar{y})$ will be 

\[ v(\bar{x}\bar{y}) = \bar{x}^2v(\bar{y}) + \bar{y}^2v(\bar{x}) - v(\bar{x})v(\bar{y}) = \frac{\bar{x}^2s^2(y)}{n(y)} + \frac{\bar{y}^2s^2(x)}{n(x)} - \frac{s^2(x)s^2(y)}{n(x)n(y)}. \tag{7} \]

When $n(x) = n(y) = n$, the variance $V(\bar{x}\bar{y})$ becomes simply 

\[ V(\bar{x}\bar{y}) = [X^2V(y) + Y^2V(x) + V(x)V(y)/n]/n, \tag{8} \]

and the unbiased estimate of $V(\bar{x}\bar{y})$ becomes 

\[ v(\bar{x}\bar{y}) = [\bar{x}^2s^2(y) + \bar{y}^2s^2(x) - s^2(x)s^2(y)/n]/n. \tag{9} \]

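
The estimate for two independent samples can be computed directly from the data. The sketch below is a minimal illustration with an invented function name and invented observations; it relies on `statistics.variance`, which returns the usual unbiased sample variance with divisor $n - 1$, and it evaluates $\bar{x}^2 s^2(y)/n(y) + \bar{y}^2 s^2(x)/n(x) - s^2(x)s^2(y)/[n(x)n(y)]$.

```python
from statistics import mean, variance


def unbiased_variance_of_mean_product(xs, ys):
    """Unbiased estimate of the variance of the product of two sample means
    computed from independent samples xs and ys."""
    x_bar, y_bar = mean(xs), mean(ys)
    v_x_bar = variance(xs) / len(xs)  # estimated variance of the mean of x
    v_y_bar = variance(ys) / len(ys)  # estimated variance of the mean of y
    return x_bar ** 2 * v_y_bar + y_bar ** 2 * v_x_bar - v_x_bar * v_y_bar


# Invented observations; the two samples need not be of the same size.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0]
print(unbiased_variance_of_mean_product(xs, ys))
```

Note that the product of the two estimated variances is subtracted here, whereas the product of the true variances is added in the exact formula.
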
If a sample of $n$ paired observations $(x_i, y_i)$ is obtained ($i = 1, 2, \ldots, n$), 
then 

\[ z = \sum_{i=1}^{n} x_i y_i / n \]

is also an unbiased estimate of $XY$ in the special case where $x$ and $y$ are inde- 
pendent random variables. In this case, the variance of $z$ is 

\[ V(z) = V(xy)/n = [X^2V(y) + Y^2V(x) + V(x)V(y)]/n, \tag{10} \]

and the relative efficiency of the estimate $z$ as compared with the estimate $\bar{x}\bar{y}$ is 

\[ V(\bar{x}\bar{y})/V(z) = \frac{X^2V(y) + Y^2V(x) + V(x)V(y)/n}{X^2V(y) + Y^2V(x) + V(x)V(y)}, \tag{11} \]

which approaches 

\[ \tilde{V}(xy)/V(xy) = [G(x) + G(y)]/[G(x) + G(y) + G(x)G(y)] \tag{12} \]

as $n \to \infty$. Thus, the estimate $z$ is less efficient than the estimate $\bar{x}\bar{y}$ in this case, 
and the relative decrease in the variance of $\bar{x}\bar{y}$ as compared with the variance 
of $z$ approaches ($n \to \infty$) 

\[ 1 - [\tilde{V}(xy)/V(xy)] = R = 1/[A + 1], \tag{13} \]

where $A = G^{-1}(x) + G^{-1}(y)$ as earlier herein. Thus, we see that, when $x$ and $y$ 
are independent, the effect of using the usual approximate formula $\tilde{V}(xy)$ rather 
than the exact formula $V(xy)$ is comparable to the effect of using the statistic 
$z$ rather than $\bar{x}\bar{y}$ as an estimate of the product $XY$ of the parameters $X$ and $Y$ 
(when $n \to \infty$). 

The preceding results can be generalized to obtain exact formulas in the
situation where the product of three (or more) independent random variables
is of interest. For example, let the three random variables be x, y, and z, where
X, Y, and Z are their respective means, V(x), V(y), V(z) are their respective
variances, and G(x), G(y), G(z) are their respective squared coefficients of
variation. Since

$$xyz - XYZ = XYZ[(\delta x + 1)(\delta y + 1)(\delta z + 1) - 1] = XYZ[\delta x + \delta y + \delta z + \delta x\,\delta y + \delta x\,\delta z + \delta y\,\delta z + \delta x\,\delta y\,\delta z], \tag{14}$$

where the δ's are defined as earlier herein, we have that the variance V(xyz)
of the product xyz is equal to

$$V(xyz) = E\{(xyz - XYZ)^2\} = (XYZ)^2[G(x) + G(y) + G(z) + G(x)G(y) + G(x)G(z) + G(y)G(z) + G(x)G(y)G(z)]. \tag{15}$$

An approximate formula for this variance (comparable to the usual approximate
formula for the variance of the product of two independent random variables)
would be

$$\tilde{V}(xyz) = (XYZ)^2[G(x) + G(y) + G(z)], \tag{16}$$

which we now see will be satisfactory only if the term

$$(XYZ)^2[G(x)G(y) + G(x)G(z) + G(y)G(z) + G(x)G(y)G(z)]$$

can be neglected. Formula (15) is a generalization of (2), while formula (16) is
a generalization of (3).
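Formula (15) extends to any number of independent factors because the bracket equals the product of the terms 1 + G minus one. The following sketch uses that algebraic rewriting; it is an illustration under the independence assumption, with hypothetical function names and nonzero means assumed.

```python
from math import prod


def exact_product_variance(means, variances):
    """Exact variance of a product of independent variables.

    Uses (X1...Xk)^2 * [prod(1 + G_i) - 1], which expands to (15) for k = 3.
    """
    g = [v / m ** 2 for m, v in zip(means, variances)]
    return prod(means) ** 2 * (prod(1.0 + gi for gi in g) - 1.0)


def approximate_product_variance(means, variances):
    """Approximation that keeps only the first-order terms, as in (16)."""
    g = [v / m ** 2 for m, v in zip(means, variances)]
    return prod(means) ** 2 * sum(g)


exact3 = exact_product_variance([2.0, 3.0, 4.0], [1.0, 1.0, 4.0])
approx3 = approximate_product_variance([2.0, 3.0, 4.0], [1.0, 1.0, 4.0])
```

With these illustrative values the exact variance is 5 * 10 * 20 - 576 = 424, while the approximation gives 352, so the neglected cross-product terms are far from negligible when the coefficients of variation are this large.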
3. THE CASE WHERE THE RANDOM VARIABLES NEED NOT BE INDEPENDENT 


Let x and y be two random variables (not necessarily independent). Let us
denote the expected value of xy by $E(xy) = M_{11}$, and the covariance between
δx and δy by $E\{\delta x\,\delta y\} = D_{11}$. We also write $E\{(\delta x)^i(\delta y)^j\} = D_{ij}$ and
$E\{(\Delta x)^i(\Delta y)^j\} = E_{ij}$, where Δx = x − X and Δy = y − Y. Since

$$xy - M_{11} = XY[(\delta x + 1)(\delta y + 1) - B_{11}] = XY[\delta x + \delta y + \delta x\,\delta y + F_{11}], \tag{17}$$

where $B_{11} = M_{11}/(XY)$ and $F_{11} = 1 - B_{11} = -D_{11}$, we see that the variance V(xy)
of the product xy is equal to

$$V(xy) = (XY)^2[G(y) + G(x) + 2D_{11} + 2D_{12} + 2D_{21} + D_{22} - D_{11}^2]$$
$$\phantom{V(xy)} = X^2 V(y) + Y^2 V(x) + 2XY E_{11} + 2X E_{12} + 2Y E_{21} + E_{22} - E_{11}^2, \tag{18}$$

where $E_{22} - E_{11}^2 = V(\Delta x\,\Delta y)$ is the variance of ΔxΔy. The usual approximate
formula for V(xy) is

$$\tilde{V}(xy) = X^2 V(y) + Y^2 V(x) + 2XY E_{11}, \tag{19}$$

which we now see will be satisfactory only if the term $2X E_{12} + 2Y E_{21} + V(\Delta x\,\Delta y)$
can be neglected.
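The exact formula (18) can be checked on any small joint distribution by comparing it with the variance of xy computed directly. The sketch below does this for a hypothetical three-point distribution; the helper names and the probabilities are assumptions chosen only for illustration.

```python
def central_moments(points):
    """points: list of (x, y, probability). Returns X, Y and a function E(i, j)."""
    mean_x = sum(p * x for x, y, p in points)
    mean_y = sum(p * y for x, y, p in points)

    def e(i, j):
        # E_ij = E{(x - X)^i (y - Y)^j}
        return sum(p * (x - mean_x) ** i * (y - mean_y) ** j for x, y, p in points)

    return mean_x, mean_y, e


def variance_by_formula_18(points):
    X, Y, e = central_moments(points)
    return (X ** 2 * e(0, 2) + Y ** 2 * e(2, 0) + 2 * X * Y * e(1, 1)
            + 2 * X * e(1, 2) + 2 * Y * e(2, 1) + e(2, 2) - e(1, 1) ** 2)


def variance_direct(points):
    m11 = sum(p * x * y for x, y, p in points)
    return sum(p * (x * y - m11) ** 2 for x, y, p in points)


# Dependent x and y on three support points (illustrative values).
joint = [(1.0, 2.0, 0.2), (2.0, 1.0, 0.3), (3.0, 4.0, 0.5)]
by_formula = variance_by_formula_18(joint)
direct = variance_direct(joint)
```

The two results agree to rounding error, since (18) is an identity; dropping the last four terms of `variance_by_formula_18` gives the usual approximation (19).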

If a sample of n paired observations $(x_i, y_i)$ is obtained (i = 1, 2, ..., n),
then

$$z = \sum_{i=1}^{n} x_i y_i / n$$

is an unbiased estimate of $M_{11}$, and $\bar{x}\bar{y}$ is a consistent estimate of XY. It is easy
to see that the expected value of $\bar{x}\bar{y}$ is $E\{\bar{x}\bar{y}\} = XY(1 - 1/n) + M_{11}/n
= XY + E_{11}/n$, so that the statistic $w = (n\bar{x}\bar{y} - z)/(n - 1)$ is an unbiased estimate
of XY. The variance of z is

$$V(z) = V(xy)/n = [X^2 V(y) + Y^2 V(x) + 2XY E_{11} + 2X E_{12} + 2Y E_{21} + V(\Delta x\,\Delta y)]/n, \tag{20}$$

while the variance of $\bar{x}\bar{y}$ is

$$V(\bar{x}\bar{y}) = \frac{X^2 V(y) + Y^2 V(x) + 2XY E_{11}}{n} + \frac{2X E_{12} + 2Y E_{21} + V(x)V(y) + E_{11}^2}{n^2} + \frac{\mathrm{Cov}[(\Delta x)^2, (\Delta y)^2] - 2E_{11}^2}{n^3}, \tag{21}$$

where $\mathrm{Cov}[(\Delta x)^2, (\Delta y)^2] = E_{22} - V(x)V(y)$ is the covariance between the
random variables $(\Delta x)^2$ and $(\Delta y)^2$. The mean squared error of $\bar{x}\bar{y}$ as an estimate
of XY is

$$\mathrm{MSE}(\bar{x}\bar{y}) = V(\bar{x}\bar{y}) + (E_{11}/n)^2. \tag{22}$$

Since the estimate w of XY is asymptotically equivalent to $\bar{x}\bar{y}$, the variance
V(w) of w can be simply approximated using the fact that the limiting value
(n → ∞) of nV(w) is equal to the limiting value of $nV(\bar{x}\bar{y})$ and $n\,\mathrm{MSE}(\bar{x}\bar{y})$.
Thus, we have that

$$V(w) \approx [X^2 V(y) + Y^2 V(x) + 2XY E_{11}]/n = \tilde{V}(xy)/n. \tag{23}$$

Consistent estimates of V(z), $V(\bar{x}\bar{y})$, $\mathrm{MSE}(\bar{x}\bar{y})$, and V(w) can be obtained by
replacing the various population moments by the corresponding sample
moments (which will be consistent estimates of the corresponding population
moments) in equations (20), (21), (22), and (23), respectively.
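The three point estimates discussed in this section are simple to compute from paired data. The sketch below returns z, the product of the sample means, and the bias-corrected statistic w = (n x̄ȳ − z)/(n − 1); the function name and the data are hypothetical.

```python
def product_estimates(pairs):
    """Return (z, product_of_means, w) for a list of (x, y) pairs, n >= 2."""
    n = len(pairs)
    x_bar = sum(x for x, _ in pairs) / n
    y_bar = sum(y for _, y in pairs) / n
    z = sum(x * y for x, y in pairs) / n          # unbiased for E(xy)
    product_of_means = x_bar * y_bar              # consistent, biased for XY
    w = (n * product_of_means - z) / (n - 1)      # unbiased for XY
    return z, product_of_means, w


z_hat, xy_bar, w_hat = product_estimates([(1.0, 2.0), (2.0, 3.0), (4.0, 7.0)])
```

For these data z = 12, the product of the means is 28/3, and w = (28 − 12)/2 = 8. Equivalently, w is the product of the means minus the usual unbiased sample covariance divided by n, which removes the bias term coming from the covariance of x and y.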
The ratio of $V(\bar{x}\bar{y})$ to V(z) approaches

$$\tilde{V}(xy)/V(xy) = \frac{X^2 V(y) + Y^2 V(x) + 2XY E_{11}}{X^2 V(y) + Y^2 V(x) + 2XY E_{11} + 2X E_{12} + 2Y E_{21} + V(\Delta x\,\Delta y)} \tag{24}$$

as n → ∞. We also see that $\tilde{V}(xy)/V(xy)$ is the limiting value for $\mathrm{MSE}(\bar{x}\bar{y})/V(z)$
and V(w)/V(z). Thus, the effect of using the usual approximate formula $\tilde{V}(xy)$
rather than the exact formula V(xy) is comparable to the relative difference
between the variance of the estimate z of $M_{11}$ and the variance of the estimate
w (or $\bar{x}\bar{y}$) of XY (when n → ∞).

The preceding results can be generalized to deal with the situation where the
product of three (or more) random variables is of interest (where these random
variables need not necessarily be independent). The method of obtaining these
results can also be generalized in order to obtain exact formulas for any of the
central moments of the product of two or more random variables. These results
can also be directly generalized to deal with situations where the ratio of random
variables or the product of powers of random variables is of interest.


ON CONDITIONAL EXPECTATIONS OF LOCATION STATISTICS 


Robert V. HOGG
State University of Iowa 

Let $T_1$ and $T_2$ be two odd location statistics, such as the sample mean
and the sample median. Let $S_1$ and $S_2$ be two even location-free statistics,
such as the sample variance and the sample range. When sampling
from a symmetric distribution, it is proved that the conditional expectation
of either odd location statistic, given the even location-free
statistics, is equal to θ, the midpoint of the distribution. This implies
that if the random weights $W_1$ and $W_2$ are functions of even location-free
statistics so that $W_1 + W_2 = 1$, then $W_1 T_1 + W_2 T_2$ is an unbiased
estimator of θ.


Recently, in this Journal, the author [1] proved that an odd location
statistic and an even location-free statistic are uncorrelated provided the
random sample is taken from a symmetric distribution. However, under these
same conditions, it can be shown that the mean regression function of an odd
location statistic on an even location-free statistic is a constant function. Of
course, this result implies that the correlation coefficient, if it exists, is equal to
zero. More importantly, however, this result is used to show that certain
randomly weighted means of unbiased statistics for a parameter are also
unbiased statistics. We first state two definitions before considering the theorem
and this application.

Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution. The statistic
$T(X_1, X_2, \ldots, X_n)$ is an odd location statistic if, for all real $x_1, x_2, \ldots, x_n$,
we have

(a) $T(x_1 + h, \ldots, x_n + h) = T(x_1, \ldots, x_n) + h$, for every h,
(b) $T(-x_1, \ldots, -x_n) = -T(x_1, \ldots, x_n)$.

The statistic $S(X_1, X_2, \ldots, X_n)$ is an even location-free statistic if, for all real
$x_1, x_2, \ldots, x_n$, we have

(c) $S(x_1 + h, \ldots, x_n + h) = S(x_1, \ldots, x_n)$, for every h,
(d) $S(-x_1, \ldots, -x_n) = S(x_1, \ldots, x_n)$.

In this terminology, we state the following theorem.


THEOREM. Let $X_1, X_2, \ldots, X_n$ be a random sample from a distribution
that is symmetric about the point θ. If the mean of an odd location statistic
$T(X_1, X_2, \ldots, X_n)$ exists and if $S(X_1, X_2, \ldots, X_n)$ is an even location-free
statistic, then the conditional expectation of T, given S = s, is equal to the
constant θ.


PROOF. First, we show that

$$E[(T - \theta)\exp(iuS)] = 0$$

for all real u. This implies that

$$E_S\{E[(T - \theta) \mid S]\exp(iuS)\} = 0,$$

where $E[(T - \theta) \mid S]$ is the conditional expectation of T − θ, given S. The
uniqueness of the Fourier transform then requires that

$$E[(T - \theta) \mid S] = 0 \quad \text{or} \quad E(T \mid S) = \theta,$$

almost everywhere. For convenience, these steps are carried out in detail under
the assumption that the symmetric distribution is of the continuous type
having probability density function f(x) with f(θ − x) = f(θ + x).

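Let us consider $E[(T - \theta)\exp(iuS)]$, which is given by

$$K(u) = \int \cdots \int [T(x_1, \ldots, x_n) - \theta]\exp[iuS(x_1, \ldots, x_n)]\,f(x_1)\cdots f(x_n)\,dx_1 \cdots dx_n.$$

Under the transformation $x_j = y_j + \theta$, j = 1, 2, ..., n, it is seen, upon using (a)
and (c) of the definitions, that

$$K(u) = \int \cdots \int T(y_1, \ldots, y_n)\exp[iuS(y_1, \ldots, y_n)]\,f(\theta + y_1)\cdots f(\theta + y_n)\,dy_1 \cdots dy_n. \tag{1}$$

The further change of variables $y_j = -z_j$, j = 1, 2, ..., n, together with (b)
and (d) of the definitions, makes it possible to write

$$K(u) = -\int \cdots \int T(z_1, \ldots, z_n)\exp[iuS(z_1, \ldots, z_n)]\,f(\theta - z_1)\cdots f(\theta - z_n)\,dz_1 \cdots dz_n. \tag{2}$$

Since $f(\theta - z_j) = f(\theta + z_j)$, a comparison of (1) and (2) yields

$$K(u) = -K(u) \quad \text{or} \quad K(u) = 0.$$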


If g(t, s) is the joint probability density function of T and S, K(u) = 0 can
be written

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (t - \theta)\exp(ius)\,g(t, s)\,dt\,ds = 0$$

or

$$\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} (t - \theta)\exp(ius)\,h(t \mid s)\,g_2(s)\,dt\,ds = 0,$$

where h(t | s) is the conditional p.d.f. of T, given S, and $g_2(s)$ is the marginal
p.d.f. of S. Accordingly, by first integrating on t, we obtain

$$\int_{-\infty}^{\infty} E[(T - \theta) \mid s]\exp(ius)\,g_2(s)\,ds = 0.$$

The uniqueness of the Fourier transform implies that

$$E[(T - \theta) \mid s]\,g_2(s) = 0.$$

Thus

$$E[(T - \theta) \mid s] = 0 \quad \text{or} \quad E(T \mid s) = \theta,$$

almost everywhere. This completes the proof of the theorem.
The same argument used above shows that if k is an odd positive integer, then

$$E[(T - \theta)^k \mid s] = 0,$$

provided the kth moment of T exists. Moreover, if $S_1$ and $S_2$ are two even
location-free statistics, an argument similar to that given above shows that

$$E[(T - \theta) \mid s_1, s_2] = 0 \quad \text{or} \quad E(T \mid s_1, s_2) = \theta,$$

almost everywhere. Finally, it should be noted that the theorem is actually
true for every random vector $(X_1, X_2, \ldots, X_n)$ whose distribution of probability
is symmetric about the point (θ, θ, ..., θ) in n-dimensional space. That
is, the assumptions that $X_1, X_2, \ldots, X_n$ are stochastically independent and
identically distributed are unnecessary. Thus we state, without formal proof,
the following generalization of the first theorem.

THEOREM. Let the random vector $(X_1, X_2, \ldots, X_n)$ have a distribution
of probability that is symmetric about the point (θ, θ, ..., θ) in n-dimensional
space. If the kth moment of an odd location statistic $T(X_1, X_2, \ldots, X_n)$ exists
and if $S_1(X_1, X_2, \ldots, X_n)$ and $S_2(X_1, X_2, \ldots, X_n)$ are even location-free
statistics, then

$$E[(T - \theta)^k \mid s_1, s_2] = 0,$$

provided k is an odd positive integer.

We now consider an application of this theorem. Let $T_1$ and $T_2$ be two
stochastically independent unbiased statistics for the parameter θ. Let $\sigma_1^2$ and
$\sigma_2^2$ be respectively the variances of these statistics. It is well known that

$$T = \frac{\sigma_2^2 T_1 + \sigma_1^2 T_2}{\sigma_1^2 + \sigma_2^2}$$

is the unbiased linear combination of $T_1$ and $T_2$ that has smallest variance.
This minimum variance is equal to

$$\frac{\sigma_1^2 \sigma_2^2}{\sigma_1^2 + \sigma_2^2}.$$

Of course, we need only the ratio $\sigma_1^2/\sigma_2^2$, rather than each of $\sigma_1^2$ and $\sigma_2^2$, to
define T. However, in the absence of the value of this ratio, there is the question
of how to weight the statistics $T_1$ and $T_2$ so as to obtain a good unbiased linear
combination of these statistics. We might consider the mean $(T_1 + T_2)/2$ as a
reasonable solution. However, this estimator is extremely poor if the ratio
$\sigma_1^2/(\sigma_1^2 + \sigma_2^2)$ is close to zero or one. That is, the ratio of the variance of
$(T_1 + T_2)/2$ to the variance of T is quite large if $\sigma_1^2/(\sigma_1^2 + \sigma_2^2)$ is near zero or one.

In a recent article, Graybill and Deal [2] suggest the use of random weights,
as there are no constant weights, under our conditions, that provide a very
satisfactory solution. That is, they suggest that a new statistic T' be defined
by replacing, in T, $\sigma_1^2$ and $\sigma_2^2$ by estimators, say $V_1$ and $V_2$ respectively. Thus

$$T' = \frac{V_2 T_1 + V_1 T_2}{V_1 + V_2}.$$

This raises the question whether this statistic T' is unbiased. In the case
considered by Graybill and Deal, the four statistics $T_1$, $T_2$, $V_1$, and $V_2$ are mutually
stochastically independent, and accordingly they are able to prove that T' is
an unbiased statistic for θ. Their argument, however, does not require that
these statistics be stochastically independent. A sufficient requirement is that

$$E(T_i \mid v_1, v_2) = \theta, \quad i = 1, 2.$$

For if this is true, then

$$E(T' \mid v_1, v_2) = \frac{v_2 E(T_1 \mid v_1, v_2) + v_1 E(T_2 \mid v_1, v_2)}{v_1 + v_2} = \theta$$

and, consequently,

$$E(T') = E_{V_1, V_2}[E(T' \mid V_1, V_2)] = E_{V_1, V_2}(\theta) = \theta.$$


If this result is used in connection with the theorem, we observe the following.
If we are sampling from distributions that are symmetric about θ and if $T_1$ and
$T_2$ are odd location statistics and if $S_1$ and $S_2$ are even location-free statistics,
then

$$\frac{S_2 T_1 + S_1 T_2}{S_1 + S_2}$$

is an unbiased statistic for θ provided $E(T_1)$ and $E(T_2)$ exist. For example,
let $X_1, X_2, \ldots, X_n$ be a random sample from a symmetric distribution about
θ and let $Y_1, Y_2, \ldots, Y_m$ be a random sample from another symmetric
distribution about θ. Then each of the statistics (i) and (ii) is unbiased (provided
the necessary means exist):

$$\frac{V_2 \bar{X} + V_1 \bar{Y}}{V_1 + V_2}, \tag{i}$$

where $\bar{X}$ and $\bar{Y}$ are the respective sample means and where

$$V_1 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n(n - 1)}, \qquad V_2 = \frac{\sum_{i=1}^{m} (Y_i - \bar{Y})^2}{m(m - 1)};$$

$$\frac{R_2 M_1 + R_1 M_2}{R_1 + R_2}, \tag{ii}$$

where $M_1$ and $M_2$ are the respective sample medians and $R_1$ and $R_2$ are the
respective sample ranges. In case (ii) it is important to note that $R_1$ and $R_2$
are not estimators of the variance; however, they are even location-free
statistics.
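The two randomly weighted statistics (i) and (ii) can be sketched in a few lines. The function names and data below are hypothetical; the check at the end illustrates only the structural property used in the theorem, namely that each combined statistic shifts with the data and changes sign when the data are negated. It is not a simulation of unbiasedness.

```python
from statistics import mean, median, variance


def variance_weighted_mean(xs, ys):
    """Statistic (i): (V2 * mean(xs) + V1 * mean(ys)) / (V1 + V2)."""
    v1 = variance(xs) / len(xs)   # sum of squared deviations / [n(n - 1)]
    v2 = variance(ys) / len(ys)
    return (v2 * mean(xs) + v1 * mean(ys)) / (v1 + v2)


def range_weighted_median(xs, ys):
    """Statistic (ii): (R2 * median(xs) + R1 * median(ys)) / (R1 + R2)."""
    r1 = max(xs) - min(xs)
    r2 = max(ys) - min(ys)
    return (r2 * median(xs) + r1 * median(ys)) / (r1 + r2)


xs = [1.0, 2.0, 4.0, 9.0]
ys = [0.5, 3.0, 3.5]
h = 10.0
checks = []
for stat in (variance_weighted_mean, range_weighted_median):
    base = stat(xs, ys)
    shifted = stat([x + h for x in xs], [y + h for y in ys])
    negated = stat([-x for x in xs], [-y for y in ys])
    checks.append((base, shifted, negated))
```

Because the weights depend on the data only through even location-free statistics, shifting both samples by h moves each statistic by exactly h and negating both samples negates it.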


A NEW BINOMIAL APPROXIMATION FOR USE IN SAMPLING
FROM FINITE POPULATIONS


Peter J. SANDIFORD
Trans-Canada Air Lines


A new way of approximating the hypergeometric distribution with a 
binomial is described which is obtained by equating the mean and 
variance of the binomial to those of the hypergeometric distribution. 
The paper gives several examples which show the approximation to 
be much more accurate than the usual binomial. 


The need frequently arises to evaluate data obtained by sampling from a
finite population. The precision of a proportion estimated in this way can
be calculated exactly by using the hypergeometric distribution, but because of
the intractable nature of this distribution, an approximation is usually
necessary.

In most cases [1], [4], [5, p. 108] a binomial approximation (which assumes
the population is infinite) is used, particularly when the observed proportion
is less than 0.1. In other cases the problem can be cast in the form of a 2×2
contingency table [6, Sec. 21.02] and use made of Finney's tables [8] or χ²
with one degree of freedom (with Yates correction for continuity) [6, Sec.
21.01]. When the population is large, i.e., 500 or more, confidence limits can
be obtained exactly from Chung and DeLury's charts [2]. Graphs given by
Coggins [3] may also be used after modification. Finally, an expansion of the
cumulative hypergeometric distribution in terms of incomplete Beta functions
has been given by Wise [10]. This expansion is mentioned again below.
This note presents a new binomial approximation which gives a remarkably 
close fit to the exact hypergeometric distribution. It has the advantages 


1) it is discrete and needs no correction for continuity, 

2) it is skewed in the same direction and by roughly the same amount as 
the exact distribution, 

3) it is a better fit than the standard binomial approximation and always 
involves lower powers and factorials, 

4) it is generally easy to compute with the aid of existing binomial tables 
[9], [11]. 


In the general case a sample of n is drawn (without replacement) from a
population of size N of which X possess a certain attribute (e.g., are defective),
so that the true proportion of individuals with the attribute is X/N or p. The
binomial approximations to the true probabilities of obtaining 0, 1, 2, ..., x, ...
individuals with the attribute in the sample are then usually obtained
from the binomial distribution with probability p and exponent n, i.e., the
binomial with the same mean value (np) as the underlying hypergeometric
distribution. The approximation proposed here is obtained by equating both
the mean and the variance of the hypergeometric distribution to those of the
approximating binomial.* This binomial has exponent r and probability p*
calculated from the relationships

$$rp^* = np = \frac{nX}{N},$$

$$rp^*(1 - p^*) = \frac{np(1 - p)(N - n)}{N - 1} = \frac{nX(N - X)(N - n)}{N^2(N - 1)}$$

(r integral).
The procedure is thus to choose r as the nearest integer to

$$\frac{nX(N - 1)}{N(N - 1) - (N - X)(N - n)}$$

and then to calculate

$$p^* = \frac{nX}{rN}.$$

The restriction to integral values of r is suggested for ease of using published
tables, but it is not strictly necessary. Interpolation for fractional values of r
improves the approximation slightly.

* Cf. Jones [7], who uses the binomial in a similar manner to approximate a more complex distribution.
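The procedure for choosing r and p* can be sketched as follows, together with a direct comparison against the exact hypergeometric probability for one case (N = 200, n = 80, X = 70, at most 32 in the sample). The function names are hypothetical, and Python's `round` is used for "nearest integer", which differs from the usual convention only for exact ties.

```python
from math import comb


def new_binomial_parameters(N, n, X):
    """Exponent r and probability p* matching the hypergeometric mean and variance."""
    r_exact = n * X * (N - 1) / (N * (N - 1) - (N - X) * (N - n))
    r = max(1, round(r_exact))          # restricted to integral values
    p_star = n * X / (r * N)            # keeps the mean r * p_star equal to n * X / N
    return r, p_star


def hypergeometric_cdf(k, N, n, X):
    """Exact P(at most k with the attribute) when sampling without replacement."""
    return sum(comb(X, j) * comb(N - X, n - j) for j in range(k + 1)) / comb(N, n)


def binomial_cdf(k, m, p):
    return sum(comb(m, j) * p ** j * (1 - p) ** (m - j) for j in range(min(k, m) + 1))


N, n, X, k = 200, 80, 70, 32
r, p_star = new_binomial_parameters(N, n, X)
exact = hypergeometric_cdf(k, N, n, X)
new_approx = binomial_cdf(k, r, p_star)
usual_approx = binomial_cdf(k, n, X / N)
```

In this case r = 46 and p* is about 0.609, and the new binomial lies much closer to the exact value than the usual binomial with exponent 80 and probability 0.35.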

In the limit N → ∞, X → ∞ with X/N = p, the above equations yield r = n,
p* = p. Thus the new and standard binomial approximations are equal in the
limit and equal the limit of the hypergeometric distribution (Feller [5, p. 47]).

For reasons given below it is necessary that the “X” in these formulae is 
taken to be the number in the smaller of the two attribute classes, and the 
“sample” is designated as the smaller of the two portions into which the 
sampling has divided the population. 

It may be noted in passing that r is always less than or equal to the smaller 
of n and X and so not only are the actual calculations easier, but the approxi- 
mating binomial is more likely to be included in published tables [9], [11]. 

The first three moments of these distributions are compared in Table I. The
second and third moments in the new approximation are only exact if r is not
restricted to integral values.

TABLE I. THE FIRST THREE MOMENTS OF THE DISTRIBUTIONS
CONSIDERED

Moment      Hypergeometric                          Binomial, new approximation     Binomial, usual
Mean        np                                      rp* = np                        np
μ_2         npq(N−n)/(N−1)                          rp*(1−p*) = npq(N−n)/(N−1)      npq
μ_3         npq(q−p)(N−n)(N−2n)/[(N−1)(N−2)]        rp*(1−p*)(1−2p*)                npq(q−p)

(Here q = 1 − p.)

The relationships between the third moments are not simple, but it can be
shown that for n and X > ½N the third moments of the two approximations
are in very poor agreement with that for the exact distribution, but for n and
X < ½N the μ_3 for the new binomial is reasonably close to that for the
hypergeometric series, and is in any case closer to the true value than is the μ_3 of the
usual binomial. The fact that the approximating binomial is not symmetrical
when X = N/2 (i.e., p = ½) does not affect the closeness of the approximation.

The adequacy of this suggested approximation is shown in Tables II and III, 
which compare the true probabilities calculated from the hypergeometric 
formula with the usual binomial values and those obtained by the method sug- 
gested here. The Tables refer to a population size of 100 and to sample sizes of 
10 and 20, respectively. 

In both Tables the columns headed “Exact values” contain the true hyper- 
geometric probabilities; those headed “Usual binomial” contain the probabili- 
ties calculated from the binomial expansion with the same n and p as the hyper- 
geometric, and the “New binomial” columns show the values calculated by the 
method suggested in this paper. 

In all cases the approximation suggested here fits the exact probabilities more 
closely than does the usual binomial. The two values of n taken as examples 
are relatively small compared with N: as n/N increases the usual binomial 
deviates more and more from the exact probabilities, while the present approxi- 
mation remains reasonably close. 

One final point worth mentioning concerns the case in which X = ½N, p = ½.
In such cases the suggested type of binomial is not symmetrical but is
nevertheless a closer approximation than the usual method: the approximation can
be improved still further by averaging the kth and (n−k)th terms of the new
binomial and using the averages instead of the calculated values for kth and
(n−k)th probabilities.

The utility of the approximation is further shown in Table IV. The probabili- 
ties in this table have been chosen to correspond with those given in Table II of 
Wise’s paper [10], since the approximation given in that paper is the most 
recent attempt known, and consist in each case of the probability that in a 
sample of 80 from a population of 200 there will be 32 or less with the attribute 
considered. 

It is seen that the new binomial values are almost as precise as those ob- 
tained by Wise. Both of these approximations are much more precise than the 
usual binomial method. 

TABLE IV. A COMPARISON WITH WISE'S APPROXIMATION

                     Probability that sample has 32 or less with attribute
No. in population    Exact value        Wise             New binomial     Usual binomial
with attribute,      (hypergeometric    approximation    approximation    approximation
i.e. X               distribution)

.99623   .99690
   70                .91308             .91289           .91490           .85403
   80                .55929             .55922           .55308           .54836
   90                .15494             .15518           .15610           .21629
.01677   .01713
100   .01504   .04646

An alternative way in which this approximation can be used is in the solution
of the inverse problem, i.e., the determination of the X which would yield,
with a given probability, an observed value of less than x. This can be done by
calculating a probability for the given values N, n and x for each of a series
of values of X and then plotting these probabilities against the corresponding
X's. The value of X corresponding to any given probability can then be read
off the graph. The results of this process for N = 200, n = 80, and x less than
7 are given in Table V. These values again correspond to examples quoted in
Wise's paper.

TABLE V. ESTIMATED VALUES OF X FOR GIVEN PROBABILITIES OF
OBTAINING x LESS THAN 7 WHEN n = 80, N = 200

Probability                   .975    .950    .900    .750    .500    .250    .100    .050
Exact X                      9.145   9.993  11.138  13.366  16.318  19.784  23.323  25.622
Wise's value                 9.117   9.990  11.134  13.367  16.330  19.796  23.332  25.634
New binomial value   8.25    9.20   10.05   11.20   13.35   16.25   19.60   23.25   25.75

Here again both approximations are close to the true value, and are both
exact if the usual practice of rounding X to integral values is adopted.
The author acknowledges with thanks many stimulating discussions with
Mr. H. J. G. Whitton.†

† Mr. Whitton has suggested on purely heuristic grounds that if it is desired to obtain confidence limits for X
(or p), given a sample value x, they might be obtained similarly from the ordinary binomial confidence limit
graphs. These tables are entered with "n" and "C/n" and yield values of "p". To avoid confusion with the notation
used in the paper these can be written $\bar{n}$, $C/\bar{n}$, and $\bar{p}$. The suggestion is to take $\bar{n}$ as the nearest integer to

$$\frac{nx(N - 1)}{n(N - 1) - (n - x)(N - n)}$$

and C = x. The values of $\bar{p}$ read off the graphs can be converted to the required X's by using the formula

$$X = \frac{N\bar{n}\bar{p}}{n}.$$

A limited number of values X obtained in this way have been found to agree very well with the values calculated
from the exact distribution. The investigation is proceeding, and will be published in a later communication.


CIRCULAR ERROR PROBABILITIES 


H. Leon Harter 
Aeronautical Research Laboratories, Wright-Patterson Air Force Base 


A problem which often arises in connection with the determination of
probabilities of various miss distances of bombs and missiles is the
following: Let x and y be two normally and independently distributed
orthogonal components of the miss distance, each with mean zero and
with standard deviations σ_x and σ_y, respectively, where for convenience
one labels the components so that σ_x ≥ σ_y. Now for various values of
c = σ_y/σ_x, it is required to determine (1) the probability P that the point
of impact lies inside a circle with center at the target and radius Kσ_x,
and (2) the value of K such that the probability is P that the point of
impact lies inside such a circle. Solutions of (1), for c = 0.0(0.1)1.0 and
K = 0.1(0.1)5.8, and (2), for the same values of c and P = 0.5, 0.75, 0.9,
0.95, 0.975, 0.99, 0.995, 0.9975, and 0.999, are given along with some
hypothetical examples of the application of the tables.


1. FORMULATION OF THE PROBLEM


This problem, which is essentially that of finding the probability that a
quadratic form does not exceed a specified value, has been considered by a
number of writers (see references). In particular, Grad and Solomon [6] have
given formulas for 2k and 2k+1 dimensions, together with brief four-place
tables for two and three dimensions. For the two-dimensional case the following
formulation, due to Fettis [3], is somewhat simpler.
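The joint probability density function of two orthogonal variables x and y
which are normally and independently distributed, each with mean zero and
with standard deviations σ_x and σ_y, respectively, is given by

$$f(x, y) = \frac{1}{2\pi\sigma_x\sigma_y}\exp\left\{-\frac{1}{2}\left[\left(\frac{x}{\sigma_x}\right)^2 + \left(\frac{y}{\sigma_y}\right)^2\right]\right\}. \tag{1}$$

The probability that a point (x, y), whose coordinates are chosen randomly and
independently from this joint distribution, will lie within a circle with center
at the origin and radius Kσ_x is

$$P(K, \sigma_x, \sigma_y) = \iint_{x^2 + y^2 \le K^2\sigma_x^2} f(x, y)\,dx\,dy. \tag{2}$$

If one introduces polar coordinates by letting x/σ_x = ρ cos θ and y/σ_x = ρ sin θ,
this probability takes the form

$$P(K, \sigma_x, \sigma_y) = \frac{\sigma_x}{2\pi\sigma_y}\int_0^{2\pi}\int_0^{K} \rho\exp\left\{-\frac{\rho^2}{2}\left[\cos^2\theta + \frac{\sigma_x^2}{\sigma_y^2}\sin^2\theta\right]\right\}d\rho\,d\theta. \tag{3}$$

If one lets c = σ_y/σ_x (without loss of generality, one may ensure that c ≤ 1 by
calling the larger of the two standard deviations σ_x) and sets 2θ = φ, one obtains

$$P(K, c) = \frac{1}{\pi c}\int_0^{\pi}\int_0^{K} \rho\exp\left\{-\frac{\rho^2}{4c^2}\left[(1 + c^2) - (1 - c^2)\cos\phi\right]\right\}d\rho\,d\phi. \tag{4}$$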
If one then lets z = ρ²/4c², (4) becomes

$$P(K, c) = \frac{2c}{\pi}\int_0^{\pi}\int_0^{K^2/4c^2} \exp\{-z[(1 + c^2) - (1 - c^2)\cos\phi]\}\,dz\,d\phi. \tag{5}$$

Upon integration with respect to z, one obtains

$$P(K, c) = \frac{2c}{\pi}\int_0^{\pi} \frac{1 - \exp\{-(K^2/4c^2)[(1 + c^2) - (1 - c^2)\cos\phi]\}}{(1 + c^2) - (1 - c^2)\cos\phi}\,d\phi. \tag{6}$$

It can easily be shown that equation (6) is equivalent to the result given by
Grad and Solomon [6, equation (22)].


2. COMPUTATION OF THE TABLES


It was not found possible to perform in closed form the integration indicated
in equation (6). Therefore it was necessary to resort to numerical integration,
which was performed on the Burroughs E101-3 computer, employing the
trapezoidal rule. Values of cos φ, accurate to eleven decimal places, were read into
the machine by means of the tape input, while values of the exponential
function, accurate to within a unit in the tenth decimal place, were computed as
needed. The probability P(K, c) was computed for c = 0.1(0.1)1.0 and
K = 0.1(0.1)5.8. Eight decimal places were carried in the calculations, with
error less than a unit in the seventh decimal place. The machine introduces a
chopping error, since digits sufficiently far to the right of the decimal point are
lost. If one or more digits are to be discarded, chopping may be replaced by
rounding by the expedient of increasing by five the leftmost digit to be discarded
and then chopping; this was done at one stage of the computation in order to
avoid an accumulation of chopping errors when the integrands were summed
as required by the trapezoidal rule. The use of the trapezoidal rule introduces
a truncation error which depends on the interval h(φ). It was possible in most
cases to obtain the required accuracy by using an interval h_1(φ) = π/20
(radian) = 9°. Smaller intervals were required, however, for high values of K
combined with low values of c; specifically, the interval h_2(φ) = π/40
(radian) = 4.5° was used for c = 0.2, K = 4.0(0.1)5.8 and for c = 0.1, K = 1.2(0.1)3.5,
while the interval h_3(φ) = π/80 (radian) = 2.25° was used for c = 0.1,
K = 3.6(0.1)5.8. The results, rounded to seven decimal places, are shown in
Table 1, which also includes values for c = 0. The latter were obtained from a
table of the normal probability function, since

$$P(K, 0) = 2\Phi(K) - 1 = (2\pi)^{-1/2}\int_{-K}^{K} e^{-x^2/2}\,dx. \tag{7}$$

It should be pointed out that if one sets c = 1 in (6), the result is

$$P(K, 1) = 1 - e^{-K^2/2}. \tag{8}$$

Comparison of the eight-decimal-place results for c = 1 with values found from
(8) verifies that the error in the former is indeed less than a unit in the seventh
decimal place.
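The trapezoidal-rule evaluation of (6) described above is easy to reproduce on a modern machine. The sketch below is an illustration, not the original E101-3 program: the function name and the default of 400 intervals are assumptions, and it requires 0 < c ≤ 1 (the case c = 0 is handled separately through (7)).

```python
from math import cos, exp, pi


def circular_probability(K, c, intervals=400):
    """P(K, c) from equation (6) by the trapezoidal rule on [0, pi]; needs 0 < c <= 1."""
    a = K * K / (4.0 * c * c)

    def integrand(phi):
        d = (1.0 + c * c) - (1.0 - c * c) * cos(phi)   # always >= 2 c^2 > 0
        return (1.0 - exp(-a * d)) / d

    h = pi / intervals
    total = 0.5 * (integrand(0.0) + integrand(pi))
    total += sum(integrand(i * h) for i in range(1, intervals))
    return (2.0 * c / pi) * h * total


p_example = circular_probability(1.5, 0.6)   # sigma_y/sigma_x = 0.6, radius 1.5 sigma_x
p_equal = circular_probability(2.0, 1.0)     # c = 1, where (8) applies
```

The first value can be compared with the tabulated P(1.5, 0.6) = 0.8129287, and the second with the closed form (8). Passing `intervals=20` corresponds to the 9° step used for most of the table.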
In order to determine, for c = 0.1(0.1)0.9, the value of K corresponding to
P = 0.5, 0.75, 0.9, 0.95, 0.975, 0.99, 0.995, 0.9975, and 0.999, it was necessary
to interpolate inversely in the unrounded table of P(K, c). These computations
were also performed on the E101-3, using Aitken's method of interpolation
with a tolerance of 5×10⁻⁸ and with provision for up to nine-point interpolation
if the tolerance is not met for fewer points. In practice, the tolerance was
never met, so that nine-point interpolation was always used; even so, from
these results it was possible to determine five-decimal-place values of K only
for small values of P (0.5 always, 0.75 usually, and 0.9 occasionally). Where
this was not possible, direct interpolation was performed to determine P for
the best estimate of K that could be obtained from the initial inverse
interpolation; then the inverse interpolation was repeated, using the estimated K
and the corresponding P as coordinates of the first point and discarding the
ninth point of the initial inverse interpolation. In this manner, it was possible to
determine the required values of K, accurate to within a unit in the fifth decimal
place. The results, rounded to five decimal places, are shown in Table 2, along
with results for c = 0, obtained by inverse interpolation in a table of the normal
probability function, and results for c = 1, obtained by substituting numerical
values of P in the equation

$$K = \sqrt{-2\ln(1 - P)}, \tag{9}$$

which was found by solving (8) for K.
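The inverse problem can also be solved without interpolation by applying bisection to a numerical evaluation of (6). This is a substitute technique, not the Aitken interpolation used for Table 2; the function names, the interval count, and the tolerance are assumptions, and 0 < c ≤ 1 is required.

```python
from math import cos, exp, log, pi, sqrt


def circular_probability(K, c, intervals=400):
    """P(K, c) from equation (6) by the trapezoidal rule on [0, pi]."""
    a = K * K / (4.0 * c * c)

    def integrand(phi):
        d = (1.0 + c * c) - (1.0 - c * c) * cos(phi)
        return (1.0 - exp(-a * d)) / d

    h = pi / intervals
    total = 0.5 * (integrand(0.0) + integrand(pi))
    total += sum(integrand(i * h) for i in range(1, intervals))
    return (2.0 * c / pi) * h * total


def radius_factor(P, c, tol=1e-10):
    """Find K with P(K, c) = P by bisection; P(K, c) increases with K."""
    lo, hi = 0.0, 1.0
    while circular_probability(hi, c) < P:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if circular_probability(mid, c) < P:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


k_median_equal = radius_factor(0.5, 1.0)   # compare with (9)
k_median_half = radius_factor(0.5, 0.5)    # c = 0.5, P = 0.5
k_95_half = radius_factor(0.95, 0.5)       # c = 0.5, P = 0.95
```

For c = 1 the result agrees with (9), and for c = 0.5 the two values can be compared with the tabulated K = 0.87042 and K = 2.03586.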


3. COMPARISON WITH OTHER TABLES


Grad and Solomon [6, Table I] have tabulated four-decimal-place values of
the probability $P(a_1 x_1^2 + a_2 x_2^2 \le t)$, where $x_1$ and $x_2$ are normally and
independently distributed with zero means and unit variances, for eight selected
pairs of values of $a_1$ and $a_2$ such that $a_2 \ge a_1$, $a_1 + a_2 = 1$ and for
t = 0.1(0.1)1.0(0.5)2.0(1.0)5.0. Table 1 of the present paper gives seven-decimal-place
values of the probability $P(c^2 x_1^2 + x_2^2 \le K^2)$, where again $x_1$ and $x_2$ are
normally and independently distributed with zero means and unit variances,
for c = 0.0(0.1)1.0 and K = 0.0(0.1)5.8. The two probabilities are equivalent if
one sets

$$c = \sqrt{a_1/a_2}, \qquad K = \sqrt{t/a_2}.$$

The pairs of values $a_2$, $a_1$ and the corresponding values of c are as follows:

a_2, a_1   .5, .5   .6, .4   .7, .3   .8, .2   .9, .1   .95, .05   .99, .01   1, 0
c          1.0000   0.8165   0.6547   0.5000   0.3333   0.2294     0.1005     0.0000
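The correspondence between the Grad-Solomon parameters and those of the present tables can be checked numerically. This small sketch assumes the mapping c² = a_1/a_2 and K² = t/a_2; the function name is hypothetical.

```python
from math import sqrt


def to_c_and_k(a1, a2, t):
    """Map Grad-Solomon (a1, a2, t) to (c, K); needs a2 >= a1 >= 0 and a2 > 0."""
    return sqrt(a1 / a2), sqrt(t / a2)


pairs = [(0.5, 0.5), (0.4, 0.6), (0.3, 0.7), (0.2, 0.8),
         (0.1, 0.9), (0.05, 0.95), (0.01, 0.99), (0.0, 1.0)]
c_values = [round(to_c_and_k(a1, a2, 1.0)[0], 4) for a1, a2 in pairs]

k_max_a2_one = to_c_and_k(0.0, 1.0, 5.0)[1]    # largest K when a2 = 1 and t = 5
k_max_a2_half = to_c_and_k(0.5, 0.5, 5.0)[1]   # largest K when a2 = .5 and t = 5
```

The rounded values of c reproduce the row of eight values above, and the two maximum values of K are the square roots of 5 and 10.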


Table 1 of the present paper has eleven values of c spanning (0, 1) with a
uniform interval of 0.1. The maximum value of t for the Grad-Solomon tables is
5.0, corresponding to a maximum value of K ranging from √5 ≈ 2.236 for a_2 = 1
to √10 ≈ 3.162 for a_2 = .5; in Table 1 of the present paper, values of K go up
to 5.8 unless, for some lower value of K, the probability, when rounded to seven
decimal places, becomes 1.0000000. Thus Table 1 of the present paper is
considerably more extensive (621 entries as compared with 120) than the
Grad-Solomon table, as well as much more accurate (seven decimal places as
compared with four).

Grad and Solomon have no table analogous to Table 2 of the present paper,
which gives the value of K corresponding to cumulative probability P for
c = 0.0(0.1)1.0 and P = 0.5, 0.75, 0.9, 0.95, 0.975, 0.99, 0.995, 0.9975, and 0.999.

Grad and Solomon [6, Table II] have also tabulated four-decimal-place
values of the spherical error probability $P(a_1 x_1^2 + a_2 x_2^2 + a_3 x_3^2 \le t)$ for nine
selected combinations of $a_3$, $a_2$, and $a_1$ and for the same values of t as in their
Table I.

Solomon [14] recently prepared more extensive and more accurate tables for
both the two- and three-dimensional cases, including tables analogous to Table 2
of the present paper.

The Numerical Analysis Department of the RAND Corporation [9] and
Burington and May [1, pp. 102-5] have published tables of offset circle
probabilities for the case σ_x = σ_y = σ (c = 1 or a_1 = a_2 = 0.5), which give the probability
of missing (or hitting) a circle of radius rg/σ whose center is at a distance D/σ
from the origin. DiDonato and Jarnagin¹ recently published extensive tables
of offset ellipse probabilities.


4. APPLICATIONS 


The application of the theory to the determination of hit probabilities for
circular targets is well known (see, for example, papers by Scheffé [13], Fraser
[4, 5], and Grad and Solomon [6]). A few somewhat oversimplified examples
will be given, however, for the purpose of illustrating the use of the Tables.
The values assigned to the parameters in these examples do not represent the
capabilities of any actual weapon.

Example 1. What is the probability that the miss distance of a bomb is less
than 180 feet if there are no systematic errors and if the random errors in two
orthogonal directions are normally and independently distributed with standard
deviations $\sigma_x = 120$ feet and $\sigma_y = 72$ feet? Solution: One has
$c = \sigma_y/\sigma_x = 72/120 = 0.6$ and $K\sigma_x = 180$, so that
K = 180/120 = 1.5. By reference to Table 1, one finds the required probability
P(1.5, 0.6) = 0.8129287.
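For readers without the tables at hand, the tabulated value can be checked by
direct numerical integration. The sketch below is one possible formulation and
is not necessarily the form used in [3]: it writes the probability in polar
coordinates, integrates out the radius analytically, and applies the
trapezoidal rule to the remaining periodic integral over the angle. The
function name and the number of nodes are illustrative assumptions.

```python
import math

def circular_error_probability(K, c, n_nodes=400):
    """Probability that x^2 + y^2 <= (K * sigma_x)^2 for independent normal
    errors with zero means and sigma_y = c * sigma_x, where 0 < c <= 1.

    Uses P = (1 / (2 pi c)) * integral over theta in [0, 2 pi) of
    (1 - exp(-K^2 g / 2)) / g, with g = cos^2(theta) + sin^2(theta) / c^2.
    More nodes are needed when K / c is large.
    """
    if not 0.0 < c <= 1.0:
        raise ValueError("c must satisfy 0 < c <= 1")
    h = 2.0 * math.pi / n_nodes
    total = 0.0
    for j in range(n_nodes):
        theta = j * h
        g = math.cos(theta) ** 2 + (math.sin(theta) / c) ** 2
        total += (1.0 - math.exp(-0.5 * K * K * g)) / g
    return total * h / (2.0 * math.pi * c)

# Example 1: sigma_x = 120 ft, sigma_y = 72 ft, miss distance 180 ft.
p = circular_error_probability(180.0 / 120.0, 72.0 / 120.0)
print(f"P(1.5, 0.6) = {p:.7f}")
```

For c = 1 the integrand is constant and the function reduces to the familiar
$1 - e^{-K^2/2}$, which provides a simple check on the implementation.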

Example 2. An ICBM has range errors and lateral errors which are normally
and independently distributed with standard deviations $\sigma_x = 2$ miles and
$\sigma_y = 1$ mile, respectively. What is the radius of a circle which may be
expected to include (a) 50% (b) 95% of the points of impact? Solution: One has
$c = \sigma_y/\sigma_x = 1/2 = 0.5$. (a) By reference to Table 2, one finds that
when c=0.5 and P=0.5, K = 0.87042. Hence the radius of a circle containing 50%
of the points of impact (called the circular probable error) is
$K\sigma_x = 0.87042(2) = 1.74084$ miles. (b) For c=0.5 and P=0.95, K = 2.03586.
Hence the radius of a circle containing 95% of the points of impact is
2.03586(2) = 4.07172 miles.
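The inverse problem of Example 2 (finding K for a given P) can also be sketched
numerically. The code below is an illustration under stated assumptions rather
than the method used to build Table 2: it evaluates the probability by the
trapezoidal rule over the polar angle and then solves for K by bisection. All
function names are hypothetical.

```python
import math

def circular_error_probability(K, c, n_nodes=400):
    """P(K, c) for zero-mean independent normal errors, sigma_y = c * sigma_x,
    0 < c <= 1, via the trapezoidal rule over the polar angle."""
    h = 2.0 * math.pi / n_nodes
    total = 0.0
    for j in range(n_nodes):
        theta = j * h
        g = math.cos(theta) ** 2 + (math.sin(theta) / c) ** 2
        total += (1.0 - math.exp(-0.5 * K * K * g)) / g
    return total * h / (2.0 * math.pi * c)

def radius_factor(P, c, K_hi=10.0, iterations=60):
    """Solve circular_error_probability(K, c) = P for K by bisection.

    Relies on the probability being increasing in K; K_hi is an assumed
    upper bracket that is ample for the probabilities used in Table 2.
    """
    lo, hi = 0.0, K_hi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if circular_error_probability(mid, c) < P:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Example 2: sigma_x = 2 miles, sigma_y = 1 mile, so c = 0.5.
sigma_x = 2.0
for P in (0.5, 0.95):
    K = radius_factor(P, 0.5)
    print(f"P = {P}: K = {K:.5f}, radius = {K * sigma_x:.5f} miles")
```

The printed values of K can be compared with the Table 2 entries quoted in the
example (0.87042 and 2.03586).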

1 DiDonato, A. R. and Jarnagin, M. P., “Integration of the General Bivariate Gaussian Distribution over
an Offset Ellipse,” Report No. 1710, U. S. Naval Weapons Laboratory, Dahlgren, Virginia, 1960.

Example 3. In high altitude bombing, the range error and the lateral error
are normally and independently distributed with standard deviations
$\sigma_x = 1000$ feet and $\sigma_y = 450$ feet, respectively. Assuming no
systematic errors, what is the probability that the miss distance is less than
one-quarter mile (1320 feet)? Solution: One has
$c = \sigma_y/\sigma_x = 450/1000 = 0.45$ and K = 1320/1000 = 1.32. Double
interpolation gives the following values for P(1.32, 0.45): 0.7760240 (linear);
0.7776033 (three-point Lagrangian); 0.7776141 (four-point Lagrangian). Thus,
in this example, the results for three- and four-point Lagrangian interpolation
agree to four decimal places, with a discrepancy of just over a unit in the fifth
decimal place, while the result of linear interpolation disagrees with these by
almost two units in the third decimal place. For practical purposes, three-point
Lagrangian interpolation will almost always give satisfactory results, though
linear interpolation often will not.
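The loss of accuracy under linear interpolation can be illustrated with a short
Python sketch. It assumes that the grid values at c = 0.4, 0.5 and K = 1.3, 1.4
may be regenerated by direct numerical integration (trapezoidal rule over the
polar angle) instead of being read from Table 1; the function names are
hypothetical.

```python
import math

def circular_error_probability(K, c, n_nodes=400):
    """P(K, c) for zero-mean independent normal errors, sigma_y = c * sigma_x,
    0 < c <= 1, via the trapezoidal rule over the polar angle."""
    h = 2.0 * math.pi / n_nodes
    total = 0.0
    for j in range(n_nodes):
        theta = j * h
        g = math.cos(theta) ** 2 + (math.sin(theta) / c) ** 2
        total += (1.0 - math.exp(-0.5 * K * K * g)) / g
    return total * h / (2.0 * math.pi * c)

def double_linear(K, c, K0, K1, c0, c1):
    """Double linear interpolation from the four surrounding grid points."""
    wK = (K - K0) / (K1 - K0)
    wc = (c - c0) / (c1 - c0)
    p00 = circular_error_probability(K0, c0)
    p10 = circular_error_probability(K1, c0)
    p01 = circular_error_probability(K0, c1)
    p11 = circular_error_probability(K1, c1)
    return ((1 - wK) * (1 - wc) * p00 + wK * (1 - wc) * p10
            + (1 - wK) * wc * p01 + wK * wc * p11)

# Example 3: K = 1.32, c = 0.45, bracketed by the 0.1-spaced table grid.
direct = circular_error_probability(1.32, 0.45)
linear = double_linear(1.32, 0.45, 1.3, 1.4, 0.4, 0.5)
print(f"direct integration:   {direct:.7f}")
print(f"linear interpolation: {linear:.7f}")
print(f"difference:           {direct - linear:.7f}")
```

The difference printed on the last line can be compared with the discrepancy of
almost two units in the third decimal place noted in the example.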


5. ACKNOWLEDGMENT 


The author wishes to acknowledge the following contributions of colleagues. 
Gwon H. Lum and Orin P. Gard stimulated the author’s interest in the prob- 
lem; Henry E. Fettis wrote the circular error probability in a form (see [3]) 
well adapted to numerical integration, and also suggested the use of the trape- 
zoidal rule, which he has shown (see [2]) to be best for integrals of this type; 
Gertrude Blanch wrote the program for Aitken’s method of interpolation on 
the Burroughs E101-3 computer. Acknowledgment is also due to the referees 
and to the Acting Editor, David L. Wallace, all of whom made helpful sug- 
gestions, and to Herbert Solomon, who was kind enough to send the author a 
preliminary copy of his more extensive and more accurate tables. 


REFERENCES 


[1] Burington, Richard S. and May, Donald C., Jr., Handbook of Probability and Statistics
with Tables, Handbook Publishers, Inc., Sandusky, Ohio, 1953.

[2] Fettis, Henry E., “Numerical calculation of certain definite integrals by Poisson’s
summation formula,” Mathematical Tables and Other Aids to Computation, 9 (1955),
85-92.

[3] Fettis, Henry E., “Some Mathematical Identities and Numerical Methods Relating
to the Bivariate Normal Probability for Circular Regions,” Wright Air Development
Center Technical Note 57-383, 1957.

[4] Fraser, D. A. S., “Generalized hit probabilities with a Gaussian target,” Annals of
Mathematical Statistics, 22 (1951), 248-55.

[5] Fraser, D. A. S., “Generalized hit probabilities with a Gaussian target, II,” Annals
of Mathematical Statistics, 24 (1954), 288-94.

[6] Grad, Arthur and Solomon, Herbert, “Distribution of quadratic forms and some
applications,” Annals of Mathematical Statistics, 26 (1955), 464-77.

[7] Gurland, John, “Distribution of quadratic forms and ratios of quadratic forms,”
Annals of Mathematical Statistics, 24 (1953), 416-27.

[8] Hotelling, Harold, “Some new methods for distributions of quadratic forms,”
Abstract, Annals of Mathematical Statistics, 19 (1948), 119.

[9] Numerical Analysis Department, “Offset Circle Probabilities,” R-234, The RAND
Corporation, Santa Monica, California, 1952.

[10] Oberg, E. N., “Approximate formulas for the radii of circles which include a specified
fraction of a normal bivariate distribution,” Annals of Mathematical Statistics, 18
(1947), 442-7.

[11] Robbins, Herbert, “The distribution of a definite quadratic form,” Annals of
Mathematical Statistics, 19 (1948), 266-70.

[12] Robbins, Herbert and Pitman, E. J. G., “Application of the method of mixtures to
quadratic forms in normal variates,” Annals of Mathematical Statistics, 20 (1949),
552-60.

[13] Scheffé, Henry, Armor and Ordnance Report No. A-224, OSRD No. 1918, Div. 2,
p. 60-1.

[14] Solomon, Herbert, “Distribution of Quadratic Forms: Tables and Applications,”
Technical Report No. 45, Applied Mathematics and Statistics Laboratories, Stanford
University, 1960.


EFFECT OF BIAS ON ESTIMATES OF THE 
CIRCULAR PROBABLE ERROR 


P. B. MORANDA
Aeronutronic Division of Ford Motor Company


Two estimates for the Circular Probable Error (CEP) are studied to
determine their effectiveness when the aim point is shifted from the
target center to another point. It is assumed that the distribution of
points is bivariate circular normal, with mean value (or aim point)
at $(m_1, m_2)$ and variances $\sigma_x^2 = \sigma_y^2 = \sigma^2$.

Both estimates are unbiased and asymptotically efficient when the 
aim point is at (0, 0). When the aim point is shifted, only one of the 
estimates is unbiased, but it requires estimation of the coordinates of 
the shifted aim point. If the amount of shift is known to be small, or 
suspected of being so, this estimate may not be as good as the biased 
estimate, which is employed as though the mean were (0,0). In this 
study the precision of the estimates is determined as well as the transi- 
tion bias which marks the point where one estimate becomes more 
effective than the other. 


1. INTRODUCTION 


In a previous study [3] the author compared four estimates of the Circular
Probable Error (CEP) and, by way of justification for the study of estimates
other than the best linear unbiased estimate, mentioned that one of the
estimates is more robust than the best under variations in the mean of the
distribution, which in the previous study was taken to be bivariate circular
normal with variance $\sigma^2$ and mean (0, 0). In this paper a quantitative
study of the effect of the variation in the mean on the precision of the
estimates is made.
Using the notation of the previous study [3], the estimates studied here are:

$$\mathrm{CEP}_1 = 1.1774\,\sqrt{n}\,\frac{\Gamma(n)}{\Gamma\!\left(\frac{2n+1}{2}\right)}\,\sqrt{\frac{1}{2n}\sum_{i=1}^{n}\left(x_i^2 + y_i^2\right)}$$

$$\mathrm{CEP}_2 = 1.1774\,\sqrt{n}\,\frac{\Gamma(n-1)}{\Gamma\!\left(\frac{2n-1}{2}\right)}\,\sqrt{\frac{1}{2n}\sum_{i=1}^{n}\left\{(x_i-\bar{x})^2 + (y_i-\bar{y})^2\right\}}$$

where these estimates are based on a sample of n independent pairs of points
$(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ and $\bar{x}$ and $\bar{y}$ are the
arithmetic means of the x’s and y’s.
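As an illustration, the two estimates can be written as short Python functions
and their behaviour under a shifted aim point examined by simulation. The
sketch assumes the forms
$\mathrm{CEP}_1 = 1.1774\sqrt{n}\,[\Gamma(n)/\Gamma(n+\tfrac12)]\,[\sum(x_i^2+y_i^2)/2n]^{1/2}$
and
$\mathrm{CEP}_2 = 1.1774\sqrt{n}\,[\Gamma(n-1)/\Gamma(n-\tfrac12)]\,[\sum\{(x_i-\bar x)^2+(y_i-\bar y)^2\}/2n]^{1/2}$;
the function names, sample size, number of trials and seed are arbitrary
choices made for the example.

```python
import math
import random

def cep1(points):
    """Estimate of CEP that treats the aim point as (0, 0)."""
    n = len(points)
    f = math.sqrt(n) * math.exp(math.lgamma(n) - math.lgamma(n + 0.5))
    s2 = sum(x * x + y * y for x, y in points) / (2.0 * n)
    return 1.1774 * f * math.sqrt(s2)

def cep2(points):
    """Estimate of CEP about the sample mean; requires n >= 2."""
    n = len(points)
    xbar = sum(x for x, _ in points) / n
    ybar = sum(y for _, y in points) / n
    g = math.sqrt(n) * math.exp(math.lgamma(n - 1) - math.lgamma(n - 0.5))
    s2 = sum((x - xbar) ** 2 + (y - ybar) ** 2 for x, y in points) / (2.0 * n)
    return 1.1774 * g * math.sqrt(s2)

def simulate_means(n, trials, m1, m2, sigma=1.0, seed=1):
    """Average CEP_1 and CEP_2 over repeated samples of size n drawn from a
    circular normal distribution with mean (m1, m2) and standard deviation sigma."""
    rng = random.Random(seed)
    total1 = total2 = 0.0
    for _ in range(trials):
        pts = [(rng.gauss(m1, sigma), rng.gauss(m2, sigma)) for _ in range(n)]
        total1 += cep1(pts)
        total2 += cep2(pts)
    return total1 / trials, total2 / trials

# True CEP is 1.1774 * sigma. Compare an unshifted and a shifted aim point.
print("mean (0, 0):", simulate_means(5, 20000, 0.0, 0.0))
print("mean (1, 1):", simulate_means(5, 20000, 1.0, 1.0))
```

With the aim point at the origin both averages should lie close to 1.1774; with
the shifted aim point only the second should.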

According to Chapman and Robbins [1], $\mathrm{CEP}_1$ has greater efficiency than
any other unbiased linear sample statistic when the mean value is (0, 0). In
case the mean is not zero but is known to be small, this estimate should be
considered.

On the other hand, it is known that $\mathrm{CEP}_2$ is asymptotically efficient
whatever the population mean may be; hence, if the mean is greatly different from
(0, 0), this estimate will be better than $\mathrm{CEP}_1$. Because 2 degrees of
freedom are lost in estimating the coordinates of the mean, this estimate will
not be as precise as $\mathrm{CEP}_1$ for small biases.


2. PRECISION OF ESTIMATES 


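It is assumed that the joint density for the component errors is

$$f(x, y) = \frac{1}{2\pi\sigma^2}\exp\left\{-\frac{1}{2\sigma^2}\left[(x-m_1)^2 + (y-m_2)^2\right]\right\} \qquad (1)$$

where the mean value of this (circular) distribution is $(m_1, m_2)$. Physically,
this represents the distribution of hits about a biased aim point.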

The CEP can be defined either as the radius of the mean-centered circle 
encompassing 50% of the probability mass or as the radius of the origin-cen- 
tered circle with the same property. In direct analogy with the Probable Error 
of a single variable, the former alternative is the reasonable choice and is to be 
used here. 

According to (8) of the previous study [3], the mean square deviation of
$\mathrm{CEP}_2$ from the true CEP is

$$\mathrm{Var}(\mathrm{CEP}_2) = (1.1774)^2\left[(n-1)\frac{\Gamma^2(n-1)}{\Gamma^2\!\left(\frac{2n-1}{2}\right)} - 1\right]\sigma^2 \qquad (2)$$

where the use of the abbreviation Var for variance is proper since
$\mathrm{CEP}_2$ is an unbiased estimate of CEP as defined above. It is recalled
that for the case of circular normal distributions, CEP = $1.1774\sigma$.

While variance is a customary choice for use in fixing precision of unbiased
estimates, there is no established criterion for measuring precision of biased
estimators. Nonetheless, a reasonable choice for such a measure is the mean
square deviation of the estimate from the true value of the parameter
estimated. This of course coincides with the variance for unbiased estimates.
In order to compare the estimates on this basis it is necessary to compute

$$E(\mathrm{CEP}_1 - \mathrm{CEP})^2 = (1.1774)^2\,E(f\hat{\sigma} - \sigma)^2 = (1.1774)^2\left[f^2 E(\hat{\sigma}^2) - 2f\sigma E(\hat{\sigma}) + \sigma^2\right] \qquad (3)$$

where

$$f = \sqrt{n}\,\frac{\Gamma(n)}{\Gamma\!\left(\frac{2n+1}{2}\right)}, \qquad \hat{\sigma}^2 = \frac{1}{2n}\sum_{i=1}^{n}\left(x_i^2 + y_i^2\right).$$


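For convenience put $m_1 = k_1\sigma$ and $m_2 = k_2\sigma$ so that the coordinates
of the bias are measured in terms of the standard deviation of the x and y errors.
For any of the n components

$$E(x_i^2) = \sigma^2(1 + k_1^2) \qquad (4)$$

and

$$E(y_i^2) = \sigma^2(1 + k_2^2) \qquad (5)$$

hence

$$E(\hat{\sigma}^2) = \frac{\sigma^2}{2}\left(2 + k_1^2 + k_2^2\right). \qquad (6)$$

Under the conditions placed on the variables, the statistic

$$u = \frac{1}{\sigma^2}\sum_{i=1}^{n}\left(x_i^2 + y_i^2\right)$$

has the non-central $\chi^2$-distribution. A very clear development of the
non-central $\chi^2$ has been made by Mann [2]. The density of u is

$$p(u) = \frac{1}{2}\left(\frac{u}{2}\right)^{n-1}\exp\left[-\frac{1}{2}(u + \lambda)\right]\sum_{m=0}^{\infty}\frac{(\lambda u)^m}{2^{2m}\,m!\,\Gamma(m+n)}$$

where

$$\lambda = \sum_{i=1}^{n}\left(k_1^2 + k_2^2\right) = n\left(k_1^2 + k_2^2\right).$$

Hence

$$E(\hat{\sigma}) = \frac{\sigma}{\sqrt{n}}\exp\left(-\frac{\lambda}{2}\right)\sum_{m=0}^{\infty}\frac{(\lambda/2)^m}{m!}\,\frac{\Gamma(m+n+\frac{1}{2})}{\Gamma(m+n)}. \qquad (7)$$

Putting for convenience $r^2 = k_1^2 + k_2^2$, (6) and (7) become

$$E(\hat{\sigma}^2) = \sigma^2\left(1 + \frac{r^2}{2}\right) \qquad (6a)$$

and

$$E(\hat{\sigma}) = \frac{\sigma}{\sqrt{n}}\exp\left(-\frac{nr^2}{2}\right)\sum_{m=0}^{\infty}\frac{(nr^2/2)^m}{m!}\,\frac{\Gamma(m+n+\frac{1}{2})}{\Gamma(m+n)}. \qquad (7a)$$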


Using (6a) and (7a), expression (3) can be evaluated. A convenient method
of tabling the results is to choose $k_1 = k_2$ so that $r^2 = 2k_1^2$, and let
$k_1$ range over 0(.1)1.0. This corresponds to a series of concentric circles
which can serve as contour lines in the Cartesian plane (the point set over
which the aim point
TABLE I. MEAN SQUARE DEVIATION OF CEP_1 FROM CEP, DIVIDED BY (CEP)^2

{In parentheses: E[(CEP_1 - CEP)^2] ÷ E[(CEP_2 - CEP)^2]}

 k_1     0.0     0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0    Var(CEP_2)/(CEP)^2
 r^2    0.00    0.02    0.08    0.18    0.32    0.50    0.72    0.98    1.28    1.62    2.00

 n=2   .1318   .1331    [?]     [?]     [?]     [?]    .1990   .2320   .2750   .3295   .3970         .2732
      (.482)  (.487)  (.503)  (.530)  (.575)  (.639)  (.728)  (.849)  (1.01)  (1.21)  (1.45)

 n=3   .0865   .0874   .0903   .0958   .1049   .1190   .1396   .1682     -       -       -           .1318
      (.656)  (.663)  (.685)  (.727)  (.796)  (.903)  (1.06)  (1.28)

 n=4   .0643   .0650   .0672   .0717   .0795   .0920   .1106     -       -       -       -           .0865
      (.743)  (.751)  (.777)  (.829)  (.919)  (1.06)  (1.28)

 n=5   .0512   .0517   .0536   .0574   .0644   .0759     -       -       -       -       -           .0643
      (.796)  (.804)  (.834)  (.893)  (1.00)  (1.18)

 n=6   .0425   .0429   .0445   .0480   .0545   .0654     -       -       -       -       -           .0512
      (.830)  (.838)  (.869)  (.937)  (1.06)  (1.28)

 n=7   .0363   .0367   .0381   .0413   .0474     -       -       -       -       -       -           .0425
      (.854)  (.864)  (.896)  (.972)  (1.12)

 n=8   .0317   .0321   .0333   .0363   .0421     -       -       -       -       -       -           .0363
      (.873)  (.884)  (.917)  (1.00)  (1.16)


bias ranges). Values of expression (3) have been obtained for each n = 1(1)8 at a
sufficient number of values of $r^2$ to determine the transition bias. In the
accompanying table the principal entries in all columns except the one on the
extreme right are $1/(\mathrm{CEP})^2$ times the mean square deviation of
$\mathrm{CEP}_1$ from the true CEP; the column on the extreme right is
$1/(\mathrm{CEP})^2$ times the variance of $\mathrm{CEP}_2$. In parentheses under
each entry the ratio $E(\mathrm{CEP}_1 - \mathrm{CEP})^2 \div E(\mathrm{CEP}_2 - \mathrm{CEP})^2$
is shown.
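Entries of Table I can be spot-checked with a short Python sketch. It assumes
that expression (3) is evaluated with $E(\hat{\sigma}^2) = \sigma^2(1 + r^2/2)$
and with $E(\hat{\sigma})$ given by the Poisson-weighted series of gamma-function
ratios for a non-central $\chi^2$ variable with 2n degrees of freedom and
non-centrality $nr^2$, and that the variance of $\mathrm{CEP}_2$ follows
expression (2). The function names and the number of series terms are
illustrative assumptions.

```python
import math

def msd_cep1_over_cep2(n, r2, terms=400):
    """E(CEP_1 - CEP)^2 / CEP^2 for sample size n and r2 = k1^2 + k2^2."""
    f = math.sqrt(n) * math.exp(math.lgamma(n) - math.lgamma(n + 0.5))
    half_lam = 0.5 * n * r2            # half the non-centrality parameter
    mean_sigma_hat = 0.0               # accumulates sqrt(n) * E(sigma_hat) / sigma
    for m in range(terms):
        if half_lam > 0.0:
            weight = math.exp(-half_lam + m * math.log(half_lam)
                              - math.lgamma(m + 1))
        else:
            weight = 1.0 if m == 0 else 0.0
        mean_sigma_hat += weight * math.exp(
            math.lgamma(m + n + 0.5) - math.lgamma(m + n))
    mean_sigma_hat /= math.sqrt(n)
    mean_sigma_hat_sq = 1.0 + 0.5 * r2  # E(sigma_hat^2) / sigma^2
    return f * f * mean_sigma_hat_sq - 2.0 * f * mean_sigma_hat + 1.0

def var_cep2_over_cep2(n):
    """Var(CEP_2) / CEP^2 from expression (2); requires n >= 2."""
    return (n - 1) * math.exp(
        2.0 * (math.lgamma(n - 1) - math.lgamma(n - 0.5))) - 1.0

for n in range(2, 9):
    for k1 in (0.0, 0.1, 0.5, 1.0):
        r2 = 2.0 * k1 * k1              # k1 = k2, as in the table
        msd = msd_cep1_over_cep2(n, r2)
        ratio = msd / var_cep2_over_cep2(n)
        print(f"n={n} k1={k1:.1f} r^2={r2:.2f}  MSD/CEP^2={msd:.4f}  ratio={ratio:.3f}")
```

A ratio below 1 indicates that $\mathrm{CEP}_1$ has the smaller mean square
deviation at that bias; the transition bias is where the ratio crosses 1.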
BIBLIOGRAPHY ON SIMULATION, GAMING, ARTIFICIAL
INTELLIGENCE AND ALLIED TOPICS*


Martin SHUBIK
General Electric Company


This bibliography contains references on the subjects of simulation, gaming,
Monte Carlo methods, artificial intelligence and a miscellaneous category
dealing with various writings on systems. The type of reference varies from
exceedingly simple expository pieces to complex technical papers.

There are obvious weaknesses in present classifications in these relatively 
new fields. For example, it is a matter of debate as to how one should distinguish 
between Monte Carlo and simulation (if the distinction is to be made at all). 

For certain purposes it may be desirable to break down the classification of 
simulation into “strategic simulation,” “tactical simulation,” analogue, high- 
speed digital computer or man-machine simulation. 

There are a host of internal company and military papers on these subjects,
many of which are classified. No attempt has been made to quote all the
relevant RAND papers; a large representative sample has been given.

Examples of the further subclassifications of simulation are provided
respectively by the work of Orcutt (I-A-3); Jennings (I-C-81 and 82); references
I-C-6, 43 and 76 are to analogue simulation. Almost all the work noted here is
high-speed digital computer simulation if it does not involve an analogue. A
notable exception is the work of the Logistics Laboratory at RAND. See I-B-54
to 60.

Under the classification of “Simulation,” references I-C-51, 52, 68, 77, 87 and
131 serve as introductory pieces. References C-31, 67 and 87 are perhaps of the
highest value to those who wish to obtain an idea of the relevance of simulation
to operations research and of some of the techniques for carrying out a
simulation of any system. There is still, however, a lack of a sufficient body of
literature concerned with technical problems of simulation.

References A-1, 3, C-27, 34, 38, 73, 74 and 75 provide examples of the use of 
simulation as a tool for economic theory and econometrics. 

C-33 and 66 should be classified more properly under gaming. The title of 
C-33, “Competitive Management Simulation,” provides an example of the lack 
of clarity that exists with respect to the definitions of gaming and simulation. 
The distinction used by this author is that gaming usually (though not always) 
makes use of a simulated environment to study the behavior of, or to teach 
individuals, while simulation is directed towards studying the behavior of a 
system given the behavior of the individual units or vice versa. Gaming always 
involves the presence of decision-makers. Simulation does not necessarily entail 
the involvement of individuals. In most instances a simulation involves only 
the machine manipulation of a model. 

The classification of gaming and allied topics can be broken down into
experimental gaming and gaming for training purposes. Both of these categories can
easily stand further subdivisions. For example, the experimental gaming can
be broken down into games which are primarily aimed at the study of economics,
social psychology, psychology or political science. The training games could
be broken down to distinguish between those aimed at teaching physical skills
(such as dexterity) and those aimed at illustrating principles or teaching “facts.”


* The author is indebted to his secretary, Miss Wilma Schwarzmann, for having suffered through the hours of
slave labor which go into the compilation of any bibliography.

With respect to games for experimentation, there has been considerable con- 
cern about the effects of the environment in which the game takes place upon 
its value as an experimental tool. It is felt by some that for purposes of studying,
say, business behavior, the business game of the type described by Bellman
et al. (II-C-7) is virtually useless because it is too highly multivariate and does 
not lend itself easily to the type of controls and analysis obtainable in well- 
designed control experiments. Others argue that any further simplifications 
would make the game so “unrealistic” that very little could be deduced about 
the behavior of individuals in actual situations from an analysis of the games. 

The distinction between “environment rich” and “environment poor” games 
is of further importance in the use of gaming as a training device for strategic 
purposes. The type of game called for by Goldhamer and Speier at RAND is 
relatively unstructured, calls for considerable role playing and for discussion 
by both the referees and the players as to the validity of the moves. For exam- 
ples of experimental gaming in economics utilizing a very closely controlled ex- 
periment, see Siegel and Fouraker (II-A-4). 

Under the title of “Gaming and Allied Topics,” references C-1, 7, 14, 56, 57, 
59, 65, 75 and 79 serve as introductory pieces. They require no previous knowl- 
edge or special training in order to read them. 

Several of the articles noted here are more closely related to small group 
studies than to gaming for training purposes. References such as C-10, 11, 31, 
32 and 35 belong to this category. 

For those interested in the construction of business games and in the tech- 
nical details that go into such construction, references C-38, 43 and 44 provide 
examples. 

Section III on Monte Carlo methods contains some expository pieces in 
references B-27, 29 and 31. The remaining references are more technical and 
deal with applications of this powerful branch of simulation. 

Section IV, entitled by the general term Systems, contains references pri- 
marily to work in economics dealing with structures, whose nature may be of 
general interest to those working in the areas of simulation or gaming. IV-B-2 
refers to a regularly published section of Behavioral Science which provides 
references to the general spectrum of computer applications to the behavioral 
sciences. 

Section V contains references to a new and highly promising field of study 
that has only recently come into being. This includes the work on artificial in- 
telligence, learning, chess-playing machines, self-organizing systems and, 
generally, the problem of building programs and/or machines which are flexible 
and capable of solving problems by general principles and search, rather than 
by a rigidly specified algorithm. 

Reference C-3 is a reprint of the presidential address given at the Midwestern 
Psychological Association. Its specific topic is learning theory. However, the 
detailed technical content is used as a medium to discuss the far more general 
problem of constructing models to explain human behavior. As such this paper 
is highly recommended to those interested in simulation and/or gaming.

Of all the bibliographies noted, all are or have been available upon request 
except I-B-1 which costs $12.00. The distinction between simulation and gam- 
ing is not clearly drawn in the bibliographies on simulation. Although I-B-1 
contains somewhat more annotations than I-B-2 and I-B-3 they appear to be 
of about comparable value. 

Those interested in the specialized topic of War Gaming will find II-B-2
most useful.

The bibliography on Artificial Intelligence, V-B-1, contains references to 
other bibliographies and provides around 240 references to this fast growing 
area and to allied topics. 

An attempt to update this bibliography is being made. Information concern- 
ing articles omitted will be gratefully received. 
Iowa State, and the development of generalized classical linear estimates was suggested
by a principle of C. F. Gauss, “The most plausible (sicherste) value of a quantity that is 
a given function of the unknown quantities in the problem, is found by substitution for 
the latter their most plausible values as determined by the Method of Least Squares.” 

ROBERT ERIC BECHHOFER, 41, wrote an article, “A Multiplicative Model for 
Analyzing Variances which are Affected by Several Factors,” for the June, 1960 issue of 
the Journal. His biographical note appears on p. 349 of that issue. 

LEO A. GOODMAN, 31, is co-author (with H. O. Hartley) of a paper, “The Pre- 
cision of Unbiased Ratio-Type Estimators,” which appeared in the June, 1958 issue of 
the Journal. A biographical note appears on p. 560 of that issue. Since that time, Goodman 
has spent the 1959-60 academic year with the Faculty of Mathematics, University of 
Cambridge, England, on leave from his position as Professor of Statistics and Sociology, 
University of Chicago. He expects to spend the forthcoming year as Visiting Professor
of Statistics and Sociology at Columbia University.

EMIL J. GUMBEL, 69, has been Adjunct Professor of Industrial Engineering at Co- 
lumbia University since 1953. During this period he has spent summer terms as Visiting 
Professor at the Free University of West Berlin. His earlier positions include an Assistant 
Professorship at the University of Heidelberg (1930-3), a Research Professorship at the 
University of Lyon, France (1934-40), and a period as Professor in the Ecole Libre des
Hautes Etudes in New York (1942-6). He has recently written Statistics of Extremes,
Columbia University Press, 1958 and another JASA article, “Applications of the
Circular Normal Distribution,” (June 1954). Gumbel will participate in the June meeting of
the International Statistical Institute in Tokyo. Also during the summer he will give
papers on theory of extreme values at the Universities of Kyoto, Osaka, Tokyo, and
Manila and lectures on mathematical statistics at Chulalongkorn University, Bangkok,
Thailand.

HARMAN LEON HARTER, 41, is a Mathematical Statistician at the Aeronautical 
Research Laboratories, Wright-Patterson Air Force Base. Before accepting this position 
in 1952, he taught in the Mathematics departments of Purdue University and Michigan 
State University. He majored in Mathematics at Carthage (Ill.) College and obtained a 
Master’s degree in Mathematics from the University of Illinois in 1941. He received his 
Ph.D. in Mathematical Statistics from Purdue in 1949. 

Harter’s principal current interests are in Design and Analysis of Experiments, Sta- 
tistical Tables, and Methods of Operations Research. He has written a number of Air 
Force Technical Papers as well as articles for the Annals of Mathematical Statistics, Bio- 
metrics, and the Journal of Chemical Physics. He was President of the Dayton Chapter of 
ASA in 1955-6. 

ROBERT VINCENT HOGG, 35, wrote an article, “Certain Uncorrelated Statistics,” 
for the June, 1960 issue of the Journal. His biographical note appears on p. 349 of that 
issue. 

ANNE SCHACHT LEE, 31, the wife of Everett S. Lee, has previously collaborated 
with her husband on articles for the American Sociological Review and Social Forces. She 
graduated from the University of Pennsylvania in 1950 and served as an Instructor in 
1956. 
EVERETT S. LEE, 40, has served as Associate Professor of Sociology at the
University of Pennsylvania, where he obtained his Ph.D. in 1952. He co-authored (with
Benjamin Malzberg) Migration and Mental Disease, New York: Social Science Research 
Council, 1956, and has contributed extensively on demographic topics to other volumes 
and to journals. 

PAUL BENJAMIN MORANDA, 40, has been with Computer Operations, Aero- 
nutronic, Newport Beach, California, since February, 1960. Prior biographical notes 
appear on p. 809 of the December, 1959 issue of the Journal in connection with Moranda’s 
article, “Comparison of estimates of the circular probable error.” 

RALPH LOWELL NELSON, 34, received his Ph.D. in Economics from Columbia 
University in 1955 after taking his B.S. at Minnesota and his A.M. at Columbia. He 
taught at Adelphi College and Northwestern University and spent a year as Census 
Monographist for the Social Science Research Council before taking his present position 
as a member of the research staff of the National Bureau of Economic Research in 1959. 
Nelson is the author of Merger Movements In American Industry 1895-1956, Princeton 
University Press, for National Bureau of Economic Research, 1959. 

JERZY NEYMAN, 66, is one of the founders of modern statistics. Universally 
acclaimed for his development with E. S. Pearson of the most widely accepted approach
to testing hypotheses, he has also made notable contributions to theories of estimation 
and sampling and to the application of statistics to many different sciences. After studying 
at the Universities of Kharkov and Warsaw, Neyman served as Head of the Biometrics 
Laboratory at the Nencki Institute in Warsaw from 1923 to 1934. Following a period as 
lecturer and reader at University College, London, he became Professor of Mathematics 
at the University of California, Berkeley, in 1938. Three years later he accepted his present 
post of Director of the Statistical Laboratory of the University of California. His many 
excellent students have played leading roles in the development of statistical training and 
research throughout the world, and the Berkeley Symposia which he organized have 
stimulated some of the most fruitful exchanges of recent times among leading statistical 
workers. 

PETER JOHNSTON SANDIFORD, 45, is Director of Operations Research, Trans- 
Canada Air Lines, Montreal, Quebec. After taking his Ph.D. in Physics at the University 
of Toronto in 1952, he worked for the Hydro-Electric Power Commission of Ontario as a 
Physicist and Chairman of Operations Research Team, and for Price Waterhouse and 
Company, Canada as Director of Operations Research before taking his present position 
in 1957. 

Sandiford’s primary interests are operations research, statistical design of experi- 
ments, and computer simulations. He has had numerous articles in scientific and engi- 
neering journals. 

The approximation presented in his current article was in response to an inquiry by 
a partner of Price Waterhouse and Company regarding the size of a sample from a uni- 
verse of 5000 accounts required to detect a defalcation with stated probability. The 
author states, “I discovered that from the symmetry of the 2X2 contingency table two 
normal binomial approximations could be used. I asked myself the question, ‘How can 
you tell which one is better?’ and was led almost immediately to the idea of equating the
mean and variance as described in the article.” 

MARTIN SHUBIK, 34, received his B.A. in Mathematics and his M.A. in Political 
Economy from the University of Toronto before taking his Ph.D. in Economics at Prince- 
ton University in 1953. From 1950 to 1953 he was a research assistant at Princeton Uni- 
versity, and from September 1951 to May 1952 he also served as a consultant for the 
Naval Air Development Center, Johnsville, Pennsylvania. From June 1953 to 1955 he 
was a research associate at Princeton. During the year 1955-6 he was a Fellow at the 
Center for Advanced Study in the Behavioral Sciences. He joined General Electric in his 
present capacity of Research Consultant in September of 1956. His publications include 
Readings in the Theory of Games and Political Behavior (Ed.), Doubleday, 1955 and Strategy 
and Market Structure, Wiley, 1959. 


CORRIGENDA 


Berkson, Joseph and Elveback, Lila, COMPETING EXPONENTIAL RISKS, WITH
PARTICULAR REFERENCE TO THE STUDY OF SMOKING AND LUNG CANCER,
Vol. 55, No. 291 (September 1960), 415-28.


The authors have supplied the following corrections: 


Page 418, equation (12) should read = 
Table 2, pp. 425-6, the headings for columns 3 and 5 on both pages should 
read “o%” instead of “o}”. 


The following entries in Table 2 should be changed: 
Column 3, line 2 (not counting headings), page 425, should read: 
N’(1 — e-*’) 
Column 5, line 1, page 425, should read: 
N’ B’? 


Column 5, line 2, page 425: 


Column 5, line 3, page 425: 

“1 — 
Column 2, line 6, page 426: 

“1 —e*”. 
Column 3, line 1, page 426: 


Column 3, line 2, page 426: 


Column 3, line 3, page 426: 


Column 5, line 3, page 426: 


—28,’ 
ve 
2 
‘ “ase 
2 —28’ 
° 
Borts, George H., REGIONAL CYCLES OF MANUFACTURING EMPLOYMENT IN
THE UNITED STATES 1914-1953, Vol. 55, No. 289 (March 1960), 151-211.


The following corrections were forwarded by Geoffrey H. Moore of the 
National Bureau of Economic Research. They have been checked by the author. 
Page 155, fn. 9: Tables 194, 196, 198 should read 206, 207, 208 
Page 172, line 14: Table 174 should be 174a 
Page 179, table: The superscript should be an asterisk to conform to the 
description in fn. 30. 


Page 183, line immediately below tabulation: Table 100 should read Table 
184. 


Halperin, Max, EXTENSION OF THE WILCOXON-MANN-WHITNEY TEST TO
SAMPLES CENSORED AT THE SAME FIXED POINT, Vol. 55, No. 289 (March
1960), 125-38.

The author has pointed out that the expression, m—r,, appearing before the 

brackets in equation (3.5), p. 129, should be part of the subscript of P. 


The reference to Table 130 in the last sentence of section 4, p. 130, should 
read, “Table 136.” 


Mickey, M. R., SOME FINITE POPULATION RATIO AND REGRESSION
ESTIMATORS, Vol. 54, No. 287 (September 1959), 594-612.


The author writes: “In reference to the portion of the paper dealing with 
sampling with variable probability, I stated, page 595, ‘Some new unbiased 
estimators are obtained, along with unbiased estimators of the variance of 
estimator.’ While this statement may have been appropriate to the Manu- 
script [1] originally submitted in 1954 and presented at the Iowa City regional 
meeting of the Institute of Mathematical Statistics, November, 1954, it was 
not appropriate to the published version. The results given on pages 604 and 
605 were published in this journal by Des Raj in 1956 [2]. The results on page 
606, which were not contained in the original manuscript, were published by
M. N. Murthy in Sankhyā in 1957 [3].

“I acknowledge the priority of Messrs. Des Raj and Murthy, and thank 
Messrs. I. Fellegi and M. N. Murthy for pointing out these omissions.” 


REFERENCES


[1] Mickey, Ray, “Some Finite Population Unbiased Ratio and Regression Estimators,”
unpublished manuscript reproduced at the Statistical Laboratory, Iowa State College,
October 1954.

[2] Des Raj, “Some Estimators in Sampling with Varying Probabilities without
Replacement,” Journal of the American Statistical Association, 51 (1956), 269-84.

[3] Murthy, M. N., “Ordered and Unordered Estimators in Sampling without
Replacement,” Sankhyā, 18 (1957), 379-90.


Moranda, P. B., COMPARISON OF ESTIMATES OF CIRCULAR PROBABLE ERROR,
Vol. 54, No. 288 (December 1959), 794-800.
William Warntz and David Neft, both of the American Geographical Society, 
have pointed out that equation (4), p. 795 should read— 


“CEP = 1.1774σ”
and that the [ appearing in the denominator of the right side of equation (6)
should read, “I.” The author concurs.


Book Reviews, Vol. 55, No. 289 (March 1960) 

V. Lewis Bassie is the author of the book Economic Forecasting, reviewed 
by C. Ashley Wright on pp. 230-1. Mr. Bassie’s name was regrettably listed 
incorrectly both in the heading and the list of book reviews. 


Editorial, Vol. 54, No. 288 (December 1959)

The editor is indebted to T. N. E. Greville of the Department of the Army 
for the following: 

Mr. Greville’s own affiliation should have been listed as “Department of the 
Army” instead of “Public Health Service.” 

The affiliation of Max Woodbury is “New York University” instead of “Uni- 
versity of Pennsylvania.” 

Michael Polanyi’s surname was misspelled “Polyani.” 

J. R. Vatnsdal was inadvertently listed “J. R. Vatnsadl.” 

These errors are sincerely regretted. 
The Fertility of American Women. Wilson H. Grabill, Clyde V. Kiser, and Pascal K. 
Whelpton. New York: John Wiley and Sons, Inc., 1958. Pp. xvi, 448. $9.50. 


Otis Dudley Duncan, University of Chicago 


Most of the currently available information on fertility trends and differentials in 
the United States is summarized in this workmanlike addition to the Census 
Monograph Series sponsored by the Social Science Research Council and the Bureau 
of the Census. Its comprehensive coverage makes it indispensable as a sourcebook 
and guide to both census and vital statistics data and useful as a point of departure 
for more intensive research. Numerous kinds of fertility measures are employed; 
statements of their definitions and properties, together with illustrations of their 
application, constitute valuable methodological material. 

Opening their analysis with a resumé of historical trends, the authors document the 
exceedingly high fertility of the Colonial period and the subsequent decline during 
the nineteenth and first third of the twentieth century. It is shown that decreases in 
rural fertility accounted for over half the national decline between 1810 and 1940, 
although early-established rural-urban and regional differentials persisted through- 
out the period. The recovery of the birth rate from its low point around 1933-36 in- 
volved no reversal of the falling trend in the proportion of very large completed 
families. Recent cohorts of women, as they complete the child-bearing cycle, how- 
ever, will show somewhat larger average numbers of children ever born than did those 
whose child-bearing years centered on the 1930’s. 

Although attention is given to fertility differentials, both early (where data are 
available) and recent, by such factors as residence, race, nativity, income, rent, 
religion, marriage duration, and labor force status, the greater amount of space is 
devoted to the 1910, 1940, and 1950 census materials on number of young children 
and number of children ever born, by occupation of husband, and the 1940 and 1950 
data, by number of school years completed. While occupational and educational 
differentials are seen, by and large, to have maintained the pattern of inverse rela- 
tionship to socio-economic status in all three years, the general rise in fertility re- 
corded between 1940 and 1950 is said to have resulted in a “narrowing of differentials” 
consistent with “the principle of relatively greatest rebound in the lowest cumulative 
fertility rates.” This statement of findings, while correct in its technical context, is 
perhaps overemphasized by the preoccupation with percentage (relative) changes 
in fertility and with relative, rather than absolute, measures of group deviations 
from central tendency. The authors might, with advantage, have applied more con- 
sistently the cohort viewpoint—developed in a later chapter—to the data on differ- 
entials. As compared with their rather unselective reporting of a confusing variety 
of measures of differentials in both current and cumulative fertility, this would have 
involved (a) more attention to the relatively “fixed” characteristic, educational 
attainment, and lesser attention to occupation, the interpretation of which is com- 
plicated by social mobility; (b) greater emphasis on cumulative fertility of cohorts 
and the time pattern of cumulation; (c) sharper focus on total, rather than marital, 
fertility, with proportions marrying being regarded as a component of total fertility; 
and (d) clearer distinctions between the analysis of fertility differentials as compo- 
nents of population change and the analysis of their bearing on fertility variation and 
change as such—recognizing that not all types of differentials are equally relevant in 
both contexts. Had this approach been maintained, I think the conclusion would 
have been that it is premature to get very excited about a “narrowing of differentials,” 
inasmuch as the cohorts in which such a tendency is most apparent were far from 
the end of the child-bearing period by 1950. The forthcoming 1960 data ought to 
provide a much clearer answer to what happened to differential fertility as a con- 
sequence of the “baby boom.” 

The mechanism of the “baby boom” itself is clarified considerably by Whelpton’s 
summary of cohort tables, presented at some length in Chapter 9. An illustrative 
calculation suggests that the difference in crude birth rates between 1945-54 and 
1930-39 was attributable in large measure to higher proportions marrying at early 
ages, in slightly larger measure to increase in proportions of married women having 
a first child by given ages, and only quite secondarily to the bearing of additional 
children by single-parity women at given ages, i.e., to “families getting larger.” Im- 
pressed with the flexible response of family-building behavior to transient circum- 
stances, the authors suggest that possible eventualities in cohort fertility are con- 
sistent with an even wider range in future crude birth rates (to 1975) than is implied 
by recent population projections of the Bureau of the Census. 

The volume serves well what surely is the primary function of a census mono- 
graph: to provide as full a descriptive exposition (some clumsy graphic presentation 
notwithstanding) of the relevant data as space allows, with meticulous formal 
demographic analysis of the variation in fertility over space, time, and social cate- 
gories of the population. For the reader with a broader scientific interest in the sub- 
ject, however, the monograph must be taken only as a starting point for fashioning 
an explanation of historical patterns and trends and, hopefully, a theory predictive 
of the variations likely to be observed under abstractly stated conditions. 


Studies in Linear and Non-Linear Programming. Kenneth J. Arrow, Leonid Hurwicz 
and Hirofumi Uzawa, with contributions by H. B. Chenery, S. M. Johnson, S. Karlin, 
T. Marschak, and R. M. Solow. Stanford, California: Stanford University Press, 1958. 
Pp. iv, 229. $7.50. 


H. O. Hartley, State University 


This book gives an account of modern developments in Linear and Non-linear 
programming. Although it comprises contributions from eight leading exponents 
on the subject, it is uniform in notation and style of presentation. The book is ar- 
ranged in three parts. Part one gives a concise development of the theorems con- 
cerning the existence of the solutions of problems in mathematical programming as 
well as their relationships. Emphasis is placed upon generalizations of the KUHN- 
TUCKER theorem in concave programming and the SADDLE-POINT theory. Of 
particular interest is the discussion of problems involving an infinite number of ac- 
tivities, as in the problem of optimum allocation of resources over an infinite number 
of time periods, sometimes, although not here, called “dynamic programming.” 

Part two undoubtedly constitutes the heart of the book. For it gives a complete 
account of the “Gradient Method” (mainly) for concave non-linear programming. 
This “Gradient Method” has similarities to the well known method of steepest ascent 
which is used to find the maximum of a mathematical function f(x) in an n dimen- 
sional space x. The following distinctions should, however, be noted: 

a. The “Gradient Method” employs the “Lagrangian functional” in place of the 
objective function f(x) and accordingly enlarges the dimension of the activity space 
(x, y) to n+m dimensions where m is the number of concave constraints. 

b. The Gradient Method makes “differential steps” in the direction of steepest 
ascent, and hence follows the one dimensional path of a system of 1st order non-linear 
differential equations in which the “step” or “time” variable t is the independent 
variable corresponding to the time sequence of the iteration process. 

c. Whilst the concave restraints are accounted for by switching to the Lagrangian, 
the non-negativeness of the variables has to be especially ensured by a discontinuous 
modification of the system of differential equations. 
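
Distinctions (a) to (c) can be illustrated with a discretized version of the procedure. The sketch below is not the algorithm as printed in the book: it replaces the differential equations by small finite (Euler) steps, ascending the Lagrangian L(x, y) = f(x) + sum_j y_j g_j(x) in x and descending it in the multipliers y, and it stands in for the discontinuous modification by clipping x and y at zero. The test problem, the step size, the iteration count and all function names are assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]


def gradient_method(
    grad_f: Callable[[Sequence[float]], Vector],
    g: Callable[[Sequence[float]], Vector],
    jac_g: Callable[[Sequence[float]], List[Vector]],
    x0: Sequence[float],
    y0: Sequence[float],
    step: float = 0.01,
    iterations: int = 5000,
) -> Tuple[Vector, Vector]:
    """Euler-discretized saddle-point search for max f(x) s.t. g_j(x) >= 0, x >= 0."""
    x, y = list(x0), list(y0)
    for _ in range(iterations):
        gf, gv, jg = grad_f(x), g(x), jac_g(x)
        # dL/dx_i = df/dx_i + sum_j y_j * dg_j/dx_i
        dx = [
            gf[i] + sum(y[j] * jg[j][i] for j in range(len(y)))
            for i in range(len(x))
        ]
        # Ascend in x, descend in y (dL/dy_j = g_j); clipping keeps x, y >= 0.
        x = [max(0.0, x[i] + step * dx[i]) for i in range(len(x))]
        y = [max(0.0, y[j] - step * gv[j]) for j in range(len(y))]
    return x, y


# Hypothetical test problem: maximize -(x1-2)^2 - (x2-2)^2 subject to 2 - x1 - x2 >= 0.
x_star, y_star = gradient_method(
    grad_f=lambda x: [-2.0 * (x[0] - 2.0), -2.0 * (x[1] - 2.0)],
    g=lambda x: [2.0 - x[0] - x[1]],
    jac_g=lambda x: [[-1.0, -1.0]],
    x0=[0.0, 0.0],
    y0=[0.0],
)
print(x_star, y_star)  # should approach x = (1, 1) with multiplier y = 2
```

The activity space here has n + m = 3 dimensions (two activities, one multiplier), and the small step size shows why many iterations are needed before the trajectory settles.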

The major contribution of this work is the proof of the convergence of the point 
x(t), y(t) as the Gradient Method trajectory is continued indefinitely (t → ∞) to the 
saddle point of the Lagrangian which is the global solution of the non-linear pro- 
gramming problem. The conditions for this global convergence are that f(x) should 
be strictly concave and the non-linear constraints, which are of the form g_j(x) ≥ 0 
(j = 1, 2, ..., m) should be concave. Ironically, this method, which covers perhaps 
the most general situation in non-linear programming so far treated in the literature, 
does not embrace linear programming as a special case. The authors do, however, 
prove certain properties of their Gradient Method under relaxed conditions includ- 
ing the convergence to local maxima of f(x) “in the small.” Moreover their recent 
work indicates that even the concaveness of their functionals is not as vital to the 
solution as this book would make it appear to be. 

A possible weakness of the book is that it is so exclusively concerned with the 
authors’ “Gradient Method” that alternative procedures which are now available in 
special cases are somewhat neglected. Apart from the considerable number of tech- 
niques for “Quadratic Programming” there are now available alternative methods in 
fairly general situations of non-linear programming, some of which also employ the 
concept of the gradient of the functional f(x). (See e.g. Rosen (1960) and Zoutendijk 
(1959).) Further, more recently “simplex-like” procedures have been reported (e.g. 
the approach by J. E. Kelley was announced at the summer meeting of the Econ- 
ometric Society in 1958 and the recent (1959) Rand Symposium contains a whole 
section on new non-linear programming methods). By comparison with alternative 
methods some comments on the computational aspects may be appropriate: With a 
new method such as the Gradient Method, experiences of using it on large scale 
problems are by necessity limited. However, it is already apparent that the inherent 
weakness of the Gradient Method is that it is proceeding at “differential steps” and 
that, therefore, the number of steps is likely to be large. This will be particularly 
wasteful if the optimum is in fact on the “boundary” of the x-space as numerous 
steps in the interior may be required to reach a boundary point whilst “simplex-like” 
procedures would in such situations immediately proceed to the boundary. Even 
in situations where the optimum is an interior point of the x-space, the Gradient 
Method would in most situations turn out to be a “slow” method compared with 
certain available methods of solving the non-linear system of first differentials 
∂f/∂x_i = 0. However, this should in no way detract from the great asset of this 
method which resides in the great generality of the problems for which it provides a 
method of solution. 

Part three is concerned with four selected problems in linear and quadratic pro- 
gramming. The first of these (section 12) gives an alternative to the simplex method 
of solving small-sized linear programming problems by evaluating all “extreme vec- 
tors” and the associated values of the objective functions and choosing the or a 
maximum from among these. The method is conceptually interesting but becomes 
unmanageable for larger-sized problems. The next section (13) deals with a special 
situation of dynamic programming in price speculation. The third section (14) offers 
an algorithm for solving or improving the allocation of “machines” to “tasks” in a 
hypothetical model of “process analysis.” In the fourth section (15), the Gradient 
Method is applied (in a modified iterative form) to a problem of economic planning 
involving quadratic functions. 
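
The idea of evaluating every extreme point and keeping the best one can be sketched as follows. This is a generic vertex enumeration for the problem "maximize c.x subject to Ax <= b, x >= 0", offered only as an illustration; it is not claimed to be the procedure of section 12, and the example data and function names are hypothetical. It assumes the feasible region is bounded, since an unbounded problem is not detected.

```python
from itertools import combinations
from typing import List, Optional, Sequence, Tuple


def solve_square(a: Sequence[Sequence[float]], b: Sequence[float],
                 eps: float = 1e-12) -> Optional[List[float]]:
    """Gauss-Jordan elimination with partial pivoting; None if the system is singular."""
    n = len(b)
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < eps:
            return None
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [m[r][k] - factor * m[col][k] for k in range(n + 1)]
    return [m[i][n] / m[i][i] for i in range(n)]


def best_extreme_point(c: Sequence[float], a: Sequence[Sequence[float]],
                       b: Sequence[float],
                       tol: float = 1e-9) -> Optional[Tuple[float, List[float]]]:
    """Return (value, x) for the best feasible vertex, or None if none is found."""
    n = len(c)
    # Stack the m inequality rows with the n sign constraints -x_i <= 0.
    rows = [list(r) for r in a] + [
        [-1.0 if k == i else 0.0 for k in range(n)] for i in range(n)
    ]
    rhs = list(b) + [0.0] * n
    best = None
    # A vertex is the intersection of n constraints holding with equality.
    for active in combinations(range(len(rows)), n):
        x = solve_square([rows[i] for i in active], [rhs[i] for i in active])
        if x is None:
            continue
        feasible = all(
            sum(rows[i][k] * x[k] for k in range(n)) <= rhs[i] + tol
            for i in range(len(rows))
        )
        if feasible:
            value = sum(c[k] * x[k] for k in range(n))
            if best is None or value > best[0]:
                best = (value, x)
    return best


# Hypothetical example: maximize 3*x1 + 2*x2 with three inequality constraints.
result = best_extreme_point(
    c=[3.0, 2.0],
    a=[[1.0, 1.0], [1.0, 3.0], [1.0, 0.0]],
    b=[4.0, 6.0, 3.0],
)
print(result)  # best vertex is x = (3, 1) with objective value 11
```

The loop runs over all choices of n constraints out of m + n, so the work grows combinatorially with problem size, which is consistent with the remark that the approach becomes unmanageable for larger problems.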

It is clear that this book constitutes a most valuable contribution to the frontier 
research in the area of mathematical programming. At the same time it must be 
stated that it cannot be regarded as giving a comprehensive exposition of this area 
in textbook fashion. 


International Journal of Abstracts, Statistical Theory and Method. International Statisti- 
cal Institute. London: Oliver and Boyd. (A quarterly journal with Volume 1, Number 1, 
dated July 1959.) £5 or $16 per year; single copies 30 s. or $4.50. 


K. J. Arnold, Michigan State University 


The modern world is placing an increasing emphasis on activities which have 
traditionally led to publication and the number of papers appearing increases at 
an increasing rate. The difficulty each of us has in scanning the publications in his 
own field is compounded by an increasing impatience to learn immediately of the 
reported advances. An increase in the number of abstracting journals is desirable 
and inevitable. 

Statistics has long been in need of some systematic collection of references to im- 
portant theoretical papers, some of which, for one reason or another, appear in 
journals not regularly scanned by statisticians. But, the field of statistics has not 
been well delineated and its solid core has only recently gained wide recognition as 
such. To be successful, an abstracting journal in statistics must restrict its field of 
interest. The two abstracting journals which involve statistics and which have had 
some success, International Journal of Abstracts on Statistical Methods in Industry and 
Quality Control and Applied Statistics (abstracting service) have chosen an economically 
important area of applied statistics on which to concentrate although they have not 
neglected theoretical papers. Other fields of applied statistics are to a degree covered 
as segments of the field of applications, e.g. the social sciences, electrical engineering, 
and computing. Mathematical statisticians have found their immediate interests 
covered by Mathematical Reviews and by Zentralblatt für Mathematik. Up till now, 
however, no abstracting journal has attempted to cover the range of interests repre- 
sented by departments of statistics as they have been constituted in American uni- 
versities. Statistical Theory Abstracts aims at about the middle of this range. The 
first sentence of Volume 1, Number 1 reads “The aim of this journal of abstracts is 
to give complete coverage of papers in the field of statistical theory and new con- 
tributions to statistical methods.” 

As with the mathematical reviews, the entries in Statistical Theory Abstracts are 
mostly abstracts but there are occasionally reviews, and we can hope that reviews 
will appear in place of abstracts where the needs of the reader will be better served 
thereby. That the editors are more interested in abstracts than reviews is clear when 
it is noted that the abstractors are usually from that part of the world in which the 
paper was published and are not infrequently the authors themselves. 

The reviews are grouped into 12 major categories and arranged by these categories 
on pages of 12 different colors. Each paper is further categorized into one of nine 
or 10 subcategories within the major category. In addition to this primary classifica- 
tion, the content of the paper is further indicated by a secondary classification within 
the same classification scheme, the code number for the secondary classification ap- 
pearing beside the code for the primary classification. Sheets are printed on one side 
only with upper and lower half page each devoted to one abstract. Holes are punched 
on the left to facilitate use in loose leaf folders and the 10-inch tall pages are divided 
into two parts by a horizontal rule which facilitates cutting into separate five-inch by 
eight-inch sheets each containing a single abstract. 

It may be proper for an abstracting journal to set up categories in such a way that 
each category will be occupied by approximately the same number of abstracts. 
Value judgments are thereby nearly avoided. Regardless of the ingenuity of the 
editors, it is impossible to set up a coding scheme which will serve the needs of the 
readers forever; emphases will change with time. In the present scheme some cate- 
gories represent fields which did not exist twenty years ago. The major category, 
“Stochastic Theory and Time Series Analysis,” may require further breakdown soon; 
it is the most populous in the first two issues of the journal. 

The editors invite readers to suggest additions to the long list of journals now 
being scanned for papers deserving of being abstracted. This tentative list is a bit 
uneven. We cannot complain of the inclusion of journals which are unlikely ever to 
be represented by abstracts but there are some serious omissions. 

Statistical Theory Abstracts will undoubtedly be of great value to a goodly number of 
statisticians and will be scanned by many more. Let us hope the editors can keep up 
the excellent quality of the first two issues. The list of editors and abstractors is im- 
pressive and augurs well for the success of the journal. 


The State and Economic Growth. Committee on Economic Growth. New York, New York: 
Social Science Research Council, 1959. Pp. x, 389. $3.75. 


G. Ugo Papi, University of Rome 


This is a book overflowing with interest, able to diffuse a deep knowledge of facts 
to the benefit of all who seek an empirical basis for theories of economic develop- 
ment. It is a book which reproduces the results of entrusting to different authors 
research on the lines followed by the economic development of the most widely dif- 
fering countries, distributing the results obtained by each author considerably in 
advance, and calling a conference for discussion of the conclusions. 

In order to organize such a complex work, Professor Hoselitz of the University of 
Chicago classified the various countries according to certain characteristics: ex- 
panding economy, where unused land as it appears can be made more profitable by 
better utilization of productive factors; dominant economy, an economy independent 
of foreign countries; autonomous economy, in which economic decision units are sep- 
arate from those which make political decisions; contrary to an induced economy, 
where the same organisms make both political and economic decisions. 

Eight countries were considered: the United States by Henry W. Broude; Aus- 
tralia by Noel G. Butlin; Canada by Hugh G. J. Aitken; Russia by George Barr Car- 
son, Jr.; Manchuria by Edwin P. Reubens; Germany by Norman G. J. Pound; 
Switzerland by Alfred Burgin; Turkey by Robert W. Kenwin. William N. Parker 
examined the development of mining activities in Germany and France. The function 
of the State in the economic development of Eastern Europe from 1860 was examined 
by Nicolas Spulber. Richard Hartshorne prepared a valuable general essay to bring 
into focus four “material” factors of economic development (transport, sources of 
energy, raw materials, capital) and four “human” factors (capacity for enterprise, 
qualified labor, inclination to save, organization of production for the market); and 
ascertained in the widely differing countries an almost compulsory “sequence” of 
development from agricultural to industrial activity to the activity of producing 
services. 

Hoselitz, for his part, has enriched the volume with a survey in which he identifies 
economic development with an increase of productivity. He shows how this increase 
can be reached either by improving the quality of human labor, accumulating sav- 
ings for production, or improving the combination of the factors. In all three ap- 
proaches, the State is of great importance. Joseph Spengler summarizes, comments and 
compares the results of the analyses and debates. 

According to Hoselitz, the United States fits several classifications: a) an expanding 
economy—in view of the almost unlimited availability of natural resources; b) a 
dominant economy—because it is for the most part self-sufficient, even though during 
the whole of the 19th century the flow of European capital considerably qualifies 
this dominance; and c) an autonomous economy—in view of the presumed minimum 
intervention of the State. 

Certainly the United States represents the type of society founded on the achieve- 
ment of maximum profit, given the system of wages and market prices, the absolutely 
free employment of productive factors, the creation of credit, the supply of capital, 
and the absence of restrictions on business activities, entrusted for the most part to 
private enterprise. However, the Government has controlled immigration, concerning 
itself with the labor supply; intervened in the working of the banking system; pro- 
tected productive activities by means of customs barriers and import duties; produced 
public services which individuals could not themselves produce; and reinforced the 
legal structure within which private enterprise has the possibility of operating with 
greater security. Such considerations lead Broude to express doubts of the pretended 
autonomy in the development of the United States. 

Burgin shows that even in the case of Switzerland it would be difficult to maintain 
that today’s standard of living is due entirely to a free economy. Situated at the cross- 
roads of vital arteries of communication, solidly entrenched in the textile industry, 
Switzerland was able to utilize large accumulations of capital drawn to her by a 
wide trade. When the Thirty Years’ War destroyed the prosperity of Germany, she 
was able to take the latter’s place, because capital and qualified labor flow spon- 
taneously towards Switzerland. Religious refugees also brought advanced knowledge 
of productive methods and an enterprising spirit. The Federal Constitution of 1848 
represented the outcome of a process of about six hundred years during which em- 
phasis was placed on constant identity between State and private interests. The 
State has continued to concern itself with the integration of the economic interests 
of the citizens, a very different philosophy from that usually known as “laissez faire.” 

Hartshorne analyzes development in terms of certain statistical indicators: a) 
income per head; b) the percentage of non-agricultural activities; c) the use of energy. 
He classifies the countries of the world into three categories: advanced countries— 
barely one-sixth of the world population; underdeveloped countries—two-thirds, un- 
fortunately, of the world population; countries in an intermediate stage—less than 
one-sixth of the world population. He shows how “commercialization” of agricultural 
products in areas where previously a “subsistence” economy predominated not only 
brings about specialized labor in transport and commercial activities, but develops 
industries for the transformation of agricultural products, encourages the manu- 
facture of forestry products, mineral industries and textiles. 

In view of the progressive creation of total demand, it becomes of enormous im- 
portance to increase agricultural productivity. The points of attack on the problem of 
development are, then, on the one hand, produce for the market; on the other, 
produce at lower cost. Since agricultural production at lower cost can generally be 
obtained only if a large part of the population working the land is able to move into 
other sectors, government has to multiply its efforts for training and professionaliza- 
tion of labor. 

The book is truly instructive on the importance of governments in promoting 
economic development. Unfortunately an analysis of the consequences of such 
actions is missing. The contribution of the book is that of an accurate presentation 
and classification of considerably refined material, without theoretical analysis. Full 
of facts and documentation, the book is a useful complement to those very rare books 
of theory on economic development—above all to the even rarer books of theory on 
the economic conduct of the State—which have appeared for some years in Europe. 
There is no doubt that development of theories can benefit from contributions like 
this, prepared with method, objectivity and organic vision of the innumerable 
phenomena of economic development. 


Merger Movements in American Industry, 1895-1956. Ralph L. Nelson. Princeton: 
Princeton University Press, 1959, for the National Bureau of Economic Research. Pp. 
xxi, 177. $5.00. 


Jesse W. Markham, Princeton University 


This small volume fills important and long-standing gaps in the quantitative data 
on merger activity: The four time series on the great turn of the century merger 
movement compiled earlier by Moody, Conant, Watkins and the Census Bureau 
had different terminal dates and all were suspected of errors and omissions; data for 
the 1904-18 period had never been assembled; the Thorp series for 1919-39 and the 
Federal Trade Commission series for 1940-56 were generally considered to be re- 
liable, but the latter was for firm disappearances through merger and hence not com- 
parable to any of the turn of the century series based on consolidations. 

The principal contribution Nelson makes is his 1895-1920 annual and quarterly 
series on consolidations, firm disappearances by consolidation, and firm disappear- 
ances by acquisition. He also measures the size of consolidations and acquisitions in 
terms of capitalization, and the incidence of merger activity by two-digit Standard 
Industrial Classification categories. This is no small contribution: Nelson shows the 
turn of the century consolidation wave to have been from one-fifth (in terms of firm 
disappearances) to over one-fourth (in terms of capitalization) larger than any 
previous estimate had shown it to be, and provides a series on firm disappearances 
for the entire period 1895-1956 (unspliced at 1919) which can be correlated with 
other measures of economic activity. He finds merger activity to be correlated with 
both the index of industrial production and the index of industrial stock prices, but 
much more highly correlated with stock prices than with production. His results 
thus agree with those reached earlier by Weston on the basis of less complete data.¹ 

Nelson examines some of the earlier identified causal forces in the 1895-1902 com- 
bination wave in the light of his new data. He finds merger activity correlated with 
business expansion and thus rejects the hypothesis that mergers are brought on by 
retardation in the rate of business growth. He is unable to correlate merger activity 
with indexes of market power but concludes that some of the large combinations 
obviously had market power as their objective. He finds that merger activity is corre- 
lated with high-transportation cost industries, but at the same time more highly 
correlated with those geographically concentrated than with those widely dispersed; 
accordingly, he concludes that increased competition brought on by railroad expan- 
sion was not a major factor. The analysis here is not convincing: business firms heavily 
concentrated at two production centers obviously may have as great an incentive to 
merge as the same number of firms geographically dispersed; e.g., U. S. Steel’s ac- 
quisition of Tennessee Coal and Iron. Finally, because merger activity in the early 
period is highly correlated with stock price movements, Nelson concludes that or- 
ganizational developments in the capital market had much to do with the early com- 
bination wave. 

¹ Cf. J. Frederick Weston, The Role of Mergers in the Growth of Large Firms (University of California Press, 
1953). 

The principal criticism to which Nelson’s statistical technique is vulnerable con- 
cerns his use of cut-off points. His 1895-1914 series omits consolidations of less than 
$1 million and acquisitions of less than $35,000; for the 1915-20 period the cut-off 
size was doubled on the rather weak grounds that between 1904 and 1919 the price 
index of the book value of manufactured capital rose 75 per cent. The procedure 
obviously injects a large discontinuity in the series around 1914 and 1915, and re- 
grettably leads Nelson to exclude the fairly numerous small consolidations (22 be- 
tween 1895 and 1914) and acquisitions (117 between 1895 and 1914) from his anal- 
ysis. Nelson’s study, as was the case of earlier studies, is therefore biased toward 
the large mergers. He justifies his cut-off points on the grounds that rising prices in- 
duced a rising built-in cut-off in the dollar size of reported mergers. It could as 
justifiably be argued that the rising interest in the subject leading to more complete 
reporting in the financial journals offset this upward trend. 

Nelson has unquestionably made a solid statistical contribution on a subject in 
need of analysis in quantitative terms. Students of industrial organization and bus- 
iness cycles alike are indebted to him for what must have been an arduous task. 


The Allocation of Economic Resources; Essays in Honor of Bernard Francis Haley. 
Moses Abramovitz and others. Stanford, California: Stanford University Press, 1959. Pp. ix, 
244. $5.00. 


Ando, Massachusetts Institute of Technology 


A reviewer of a collection of essays such as this one has only two choices: either 
to describe each essay very briefly, or to present a somewhat detailed discussion 
of one or two essays, neglecting all others. While I have been delayed in my task for 
personal reasons, an excellent over-all review of the book has appeared,¹ and I feel 
free to concentrate on two essays that attracted my attention. 

As a student of monetary theory, I was particularly interested in Professor Shaw’s 
essay, though it appeared elsewhere previously.² I have some sympathy for Shaw’s 
impatience with the past performance of the Federal Reserve System and Treasury, 
but I am afraid that this is about as far as my agreement with Shaw goes. To judge 
the performance of a specific policy mechanism, one must have some specific criteria, 
and though Shaw is not explicit, it seems safe to assume that he considers a stable 
growth of output and employment the major goal of the economic system, a goal 
with which I have no quarrel. Now I know that Shaw is too good an economist to 
suggest that a set of simple figures on the rate of increase of money supply enables 
him to make any inference on the relationship between the monetary mechanisms 
and the fluctuations of output and employment. On the theoretical side, he seems to 
suggest that expression P(kT) describes the demand for money adequately, where P 
is the price level of goods and services, T is the real national income, and k is the 
reciprocal of velocity. Contrary to his assertion, we know that k has varied enough 
to make the fluctuation of national income considerable even if the money supply 
were increased at constant rate. Even Milton Friedman, probably the staunchest 
defender of the idea of a constant velocity, would agree that the ratio of money supply 
to current income varies considerably. It is only when the concepts of permanent in- 
come and permanent price level are substituted for those of current income and 


1 Review by Dorfman, Robert, American Economic Review, December, 1959, pp. 1055-8. 
2 Shaw, Edward S., “Money Supply and Stable Economic Growth,” United States Monetary Policy, Sym- 
posium of American Assembly, December, 1958. 
current price level that Friedman would assert the constancy of the velocity.³ I fail 
to see how Shaw’s theoretical structure aids his contention that the automatic rule of 
increasing the money supply at a constant rate would insure a steady growth of em- 
ployment and output. 
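
A minimal numerical sketch of this point, assuming the demand for money takes the form M = kPT discussed above, so that nominal income PT equals M/k. The starting money stock, the 3 per cent growth rate, and the values of k are hypothetical and chosen only for illustration; they are not taken from Shaw's essay or from any data.

```python
def nominal_income_path(m0, money_growth, k_path):
    """Nominal income P*T = M / k for each period, with M growing at a fixed rate."""
    incomes = []
    money = m0
    for k in k_path:
        incomes.append(money / k)      # invert M = k * (P*T)
        money *= 1.0 + money_growth    # automatic rule: constant money growth
    return incomes


def growth_rates(series):
    """Period-to-period growth rates of a series."""
    return [later / earlier - 1.0 for earlier, later in zip(series, series[1:])]


# Hypothetical values of k (the reciprocal of velocity).
k_constant = [0.30, 0.30, 0.30, 0.30]
k_varying = [0.30, 0.33, 0.29, 0.32]

steady = growth_rates(nominal_income_path(100.0, 0.03, k_constant))
bumpy = growth_rates(nominal_income_path(100.0, 0.03, k_varying))

print("constant k:", [round(g, 4) for g in steady])
print("varying k: ", [round(g, 4) for g in bumpy])
```

With k fixed, nominal income grows at exactly the rate of the money supply; once k is allowed to move, the same constant-growth rule yields income growth that swings between negative and positive values.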

I would suggest that the failure of monetary policy is not due to the discretion left 
for the monetary authorities, but rather due to the insufficient attention given by 
those exercising the discretion to the time lags that exist between the time the policy 
actions are taken by the authorities and the time at which the effects of the policy 
are beginning to be felt. The recognition of such lags would lead to the emphasis on 
the importance of co-ordinating the effects of the history of past policies and the 
anticipated effects of current policy. 

I now turn to the essay by Armen Alchian, “Cost and Output.” He starts his 
essay with the remark, “Obscurities, ambiguities and errors exist in cost and supply 
analysis despite, or because of, the immense literature on the subject. . . . Proposi- 
tions designed to eliminate some of these ambiguities and errors are presented in 
this paper.” [Italics by the reviewer.] It seems to me that he has succeeded in increas- 
ing ambiguities, if not errors, rather than eliminating them for the readers of his 
paper. The basic relation on which Alchian builds his propositions is 

$$V = \int_{T}^{T+m} x(t)\,dt$$


where V, x(t), T and T+m are, respectively, total output, the rate of output at time 
t, the time at which output begins to flow, and the time at which the production 
ceases. 
Because of the limitations on space, I shall discuss only Proposition I of Alchian 
in some detail. It is 


$$\left.\frac{\partial C}{\partial x(t)}\right|_{T=T_0,\;V=V_0} > 0,$$


where C is the discounted present value of the total cost. Now this is subject to a 
variety of interpretations. It would be meaningless if x were in fact a function of 
time, as the notation indicates. However, from the context, it is clear that x(t) is 
assumed to be constant for T < t < T+m, so that the left hand side of the above in- 
equality is, in spite of its bizarre appearance, a simple partial derivative. It seems to 
me that it would be surprising if there did not exist some range of x for which this 
proposition is true, but it would be equally surprising if this proposition is true for 
all ranges of x. When x becomes very small, by lowering x still further, the total un- 
discounted cost cannot be reduced very much but is more likely to increase. But, 
under Alchian’s definition of terms, m must increase when x is lowered while V and 
T are kept constant. Hence, the cost as measured by changes in the present value of 
equity would almost certainly increase as the result of discounting. Thus, Proposition 
I cannot be true for a sufficiently low initial value of x. 
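
The dependence of m on the rate of output in this argument can be made explicit. A short derivation, assuming (as the context of the proposition indicates) that the rate of output is a constant x over the whole production period:

```latex
V = \int_{T}^{T+m} x \, dt = x\,m
\quad\Longrightarrow\quad
m = \frac{V}{x},
\qquad
\left.\frac{\partial m}{\partial x}\right|_{V} = -\frac{V}{x^{2}} < 0 .
```

With V and T held fixed, m therefore grows without bound as x approaches zero, so production is spread over an ever longer period, which is where discounting enters the comparison.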

Another startling example of very careless reasoning by Alchian on a subject that 
requires most careful reasoning appears in his Figure 1 on page 27. Alchian does not seem 
to be disturbed by the implication of his figure that there is a positive cost associated 
with positive volume of total production while the rate of output is at all times zero. 

I have stressed the points about which I felt most uneasy in only two out of many 
essays in this collection. On the whole, I must report that reading through this volume 
is a very informative, rewarding experience, and lack of comment on the remaining 
essays in this review should not be taken as an indication of lack of stimulation. 


3 Friedman, Milton, “The Demand for Money: Some Theoretical and Empirical Results,” The Journal of Politi- 
cal Economy, August, 1959. 
Collecting Financial Data By Consumer Panel Techniques. (Studies in Consumer Behavior, 
No. 1.) Robert Ferber. Urbana, Illinois: Bureau of Economic and Business Research, Uni- 
versity of Illinois, 1959. Pp. x, 177. $2.50. 


James Duncan Shaffer, Michigan State University 


This is the first of a series of technical reports, dealing with the collection and 
‘Receieniies of financial data obtained from consumers, promised by the Inter- 
University Committee for Research on Consumer Behavior. The present report 
summarizes the procedures and the methodological results from a series of five in- 
terviews with the same group of families. The interviews were concerned primarily 
with financial data related to savings. Over 100 pages of the book are devoted to 
appendices reproducing the sample design, letters to respondents, instructions to 
interviewers, and the questionnaires used. 

Topics discussed included the following: The rate of response (Four out of five 
households contacted gave some information on saving behavior. The mortality rate 
for the panel was 24% over the five waves of interviews); the nature of refusals; the 
influence of interviewers; the field costs (It was estimated that the field costs for a 
panel interviewed four times a year would be $60 per family per year); influence of 
introductory letters; effect of gifts and other incentives; rapport; use of a continuous 
diary; methods of combining reports to improve reliability of estimates (The use of 
regression analysis to obtain estimates of missing data was discussed); and com- 
pleteness of reporting. The study showed that repeated interviewing turns up new 
facts which should have been reported earlier. This raises a question about accuracy 
of financial data obtained by single interview. 

Providing detailed financial information involves a good deal of time and effort 
on the part of the respondent. The ideal would be a representative sample reporting 
periodically by mail with the reports based upon actual financial documents. Actually 
an effort was made to get a few families to keep a financial diary, with limited success. 
The main reason given for not keeping the diary was the work involved. They were 
not offered pay for doing the work. A good project for future testing would be that 
of offering a substantial payment for keeping the necessary records and reporting by 
mail. With the cost of interviews running over $15 each it might be cheaper. 

The major limitation of the study was the small sample which severely curtailed 
the scope of the analysis. Only 88 savings units reported on all five survey waves. In- 
clusion of supporting information from related studies, such as those connected with 
the Federal Reserve Survey of Consumer Finances, would have increased the value 
of the report. 

Obtaining accurate financial information by survey requires careful attention to a 
great many details. This little book should help anyone attempting research in this 
area. 


Consumer Expectations, Plans and Purchases: A Progress Report. F. Thomas Juster. Occa- 
sional Paper No. 70. New York: National Bureau of Economic Research, 1959. Pp. 
xviii, 174. $2.50. Paper. 


Harold W. Watts, Cowles Foundation for Research in Economics 


This paper presents some preliminary findings of a very interesting study of 
“forward-looking” variables, such as plans and expectations, as predictors of con- 
sumer expenditures on durable goods. If the remaining part of the study fulfills the 
promise this reviewer sees in the present paper, the study will represent a sub- 
stantial contribution to a growing literature on short-term economic forecasting. 
The goal of the study, as defined in the introduction, is the “. . . development of 
tools for forecasting. . . .” This choice of objective can be, indeed must be, invoked 
to justify the use of a quite specialized body of data and an eclectic approach to the 
search for relationships. The data are cross-section samples from the membership 
of the Consumers Union of the United States (CU). Although this group is not repre- 
sentative of the entire U. S. population it may be true that such a sample is very well 
suited to the study and prediction of durable goods expenditures; the findings of the 
present paper support this notion. The members of the CU are younger, richer, and 
more highly educated than the whole population and for these reasons probably 
represent a large fraction of that portion of the population which is active in the 
durable goods market. It may also include more than its share of the “Joneses” whom 
everyone else “keeps up with.” 

The study is primarily concerned with the predictive ability of the reported buying 
plans of the CU members. The approach taken by the author interprets these plans 
as contingent on the fulfillment of the expectations on which the plans were pre- 
dicated. The analysis then proceeds to explore the relations between various measures 
of the household’s current and expected economic situation and its buying plans. 
These relationships are also compared with those between the household’s economic 
status and outlook and its actual purchases of durable goods. Subsequent analysis, 
which will require re-interview data, will try to find out how plans are changed when 
experience diverges from expectations, and also whether “unplanned” purchases 
can be related to unexpected changes in circumstances. 

In his analysis of the cross-section data, the author has decided against using 
multiple and partial regression analysis and has instead used a much less efficient 
non-parametric technique. The primary reason for his choice is that the computa- 
tions for regression analysis are more time consuming. Although simplicity and com- 
puting ease are relevant criteria, they seem somewhat out of place in the context 
of the recent development of data processing equipment. This reviewer recognizes 
that more and larger regression models are not a sufficient condition for progress and 
he is particularly aware of the temptations to numerical excess which result from the 
use of high-speed machinery. Yet it seems reasonable to expect that a careful use of 
regression analysis, aided by modern equipment, would have made possible some 
reduction from the 54 tables which appear in the text without loss of meaningful 
content. 

Briefly, the paper should appeal to anyone who has an interest in the area of 
short-term economic forecasting. It is informative, it contains valuable data which 
could bear further analysis, and it sharpens one’s interest in seeing the remaining 
part of the study. 


The International Standardisation of Labour Statistics. International Labour Office. 
Geneva, 1959. Pp. 124. $1.00. Paper. 


Gertrude Bancroft, U. S. Department of Labor 


This is the third edition of a report first published in 1934, which had as its purpose 
the compilation of resolutions adopted by the International Conferences of Labor 
Statisticians and by other international agencies with interests in the subject of 
labor statistics. The present edition provides a description of the general work done 
at the international level for the purpose of standardization, and then summarizes 
the prevailing standards for concepts, definitions, and tabulations in each of the 
following subjects: (I) Major Economic Classifications (industries, occupations, 
status); (II) Labor Force, Employment, Unemployment and Underemployment; 
(III) Wages, Hours of Work and Labor Income; (IV) Consumer Price Indices; (V) 
Family Living Studies; (VI) International Comparisons of Real Wages; (VII) Social 
Security; (VIII) Industrial Injuries and Occupational Diseases; (IX) Industrial 
Disputes; (X) Collective Agreements; (XI) Migration. For each subject, there is a 
brief history of the steps in the development of the international standards, with a 
list of references to the relevant documents. Altogether, the report is a useful and 
convenient summary for students and labor statisticians. 

With the prodigious expansion in statistical programs since the end of World War 
II, international standards and guidance for statistical programs in developing 
countries have been badly needed. Although any country should properly give first 
priority to its own requirements in designing a statistical system, it nevertheless may 
be helped to meet those requirements by the use of standard concepts, and will 
doubtless profit by the ability to compare its own data with those of other countries. 
It is not necessary to spell out the desirability of uniform standards for the various 
international agencies which not only compile and present the statistics but also 
are called upon to use them. There can be no doubt that the work of the nine Con- 
ferences of Labor Statisticians sponsored by the International Labor Office and 
that of other international bodies has helped to improve the quality of statistics 
as well as their comparability. 

To this reviewer, many of the agreed-upon standards, however, appear to have 
been developed without any regard to the problems of measurement. Some of them 
require detail and precision of information that few, if any, of the most statistically 
advanced countries can provide. For example, the standard definition of unemploy- 
ment can be used only in a country with a population sample survey. It is too com- 
plex for a general population census, and includes categories of persons who would 
not be covered in any other system—social insurance, employment office statistics, 
or trade union statistics. Indeed, it has one or two refinements that would not be 
feasible even in the United States, where the measurement of unemployment has 
become a rather intricate process. One has only to look at any compilation of un- 
employment statistics by country, including the carefully footnoted tables put out 
by the ILO in the Year Book of Labour Statistics, to see how far from uniform they 
are in concept and method of derivation. If the standard is met, it is only in a limited 
way. 

Perhaps the next task of the ILO staff and the statisticians who meet for the Tenth 
and later international conferences should be to undertake analytical studies to de- 
rive comparable statistics in a few critical fields or to measure those segments of the 
problem that can be standardized. These, of course, would be much more difficult 
tasks than, say, developing the definitive, theoretical concept of “invisible under- 
employment,” but at this stage in the world’s history, conceivably more useful. 


Housing Issues in Economic Stabilization Policy. Leo Grebler. (National Bureau of Eco- 
nomic Research, Occasional Paper 72.) New York: National Bureau of Economic Research, 
Inc., 1960. Pp. xi, 129. $1.50. Paper. 


Richard F. Muth, University of Chicago 


This essay is “an appraisal of the 1953-1957 record of government housing credit 
policies in relation to general economic stabilization policies.” (p. v.) It is worth- 
while reading for anyone interested in the impact of government housing programs 
and in monetary and fiscal policies for economic stabilization. While Grebler includes 
some discussion of the merits of selective controls on residential construction in 
mitigating fluctuations in economic activity, his principal contribution lies in his 
analysis of the 1953-58 residential construction cycle. 

His conclusions about the second major boom and subsequent decline in residential 
building in the post-war period can be summarized as follows. These fluctuations re- 
sulted primarily from changes in the supply of mortgage funds brought about by 
changes in other demands for funds. Changes in federal credit policies had but a 
minor influence, coming after the major movements in both the upswing and down- 
swing were well underway and being of minor quantitative significance. Post-war 
fluctuations in home-building have tended to moderate recessionary forces and to 
slow down the pace of expansions, acting as stabilizers though with a lag of from two 
to three quarters. 

While I am in complete agreement with Grebler on these major points, certain 
other of his conclusions I find less justifiable. For example, Grebler argues that in 
the boom of 1953-54 financial institutions were “overly zealous” in their mortgage 
lending and that consequently “maladjustments” developed in some housing markets 
in early 1955. His evidence for this, however, I found not at all convincing. Likewise, 
in discussing the impact of FNMA secondary market purchases during 1956-57, he 
correctly points out that not all were necessarily net additions to the supply of 
mortgage funds. This is because FNMA’s actions might induce other mortgage 
holders to shift to other investments. The author’s conclusion that FNMA’s pur- 
chases were a “substantial stabilizing force” even if “reasonable allowance” is made 
for these “side effects” (pp. 84-5) is completely without justification. While he may be 
correct in his judgment, and I believe not, he presents no data or other evidence 
whatever to support his belief. 

Turning to policy questions this essay discusses, Grebler argues that selective 
controls of residential construction can make a positive contribution to economic 
stability. While this may be the case, he ignores, for the most part, the question of 
whether selective controls are necessary or desirable as substitutes or supplements 
for general monetary controls. He presents convincing arguments that it would be 
both impossible and undesirable to stabilize the level of new residential construction 
and that the supposedly “high social priority” of housing is subordinate to the goal 
of economic stability. Finally, Grebler argues in favor of differential selective controls 
in different markets to prevent imbalances. As noted above, however, there is no real 
evidence that any such controls are needed, nor is it obvious that government regula- 
tion could do a better job of allocating resources to housing than the market mech- 
anism. 

Statistically, there is little to comment upon in this book. While it presents a 
great deal of data relating to new construction in the post-war period that is not 
collected in any other volume I am familiar with, it would have been helpful to the 
reader if the author had devoted more space to describing the nature of some of the 
measures presented. Several of the series are from an as yet unpublished manuscript, 
while for many others about the only description given is the name of the compiling 
or publishing agency. In Table 1 data on housing starts are presented which have 
given rise to the currently accepted conclusion that post-war fluctuations in resi- 
dential construction have been largely accounted for by starts under the FHA or VA 
programs. Grebler is one of the few who have called attention to the fact that this is at 
least partly a statistical illusion. The author dismisses the spurious effect as small, 
however, without any attempt to assess its magnitude (p. 95). This failure mars 
Grebler’s generally worthwhile effort. 
Methodology of the Survey of Consumer Expenditures in 1950, Study of Consumer Ex- 
penditures, Incomes and Savings in the United States. Helen Humes Lamale. Philadel- 
phia, Pennsylvania: Wharton School of Finance and Commerce, University of Pennsyl- 
vania, 1959. Pp. xiv, 359. Price not listed. 


Philip Goupen, New York City 


Users of the tabulations from the 1950 Survey of Consumer Expenditures pub- 
lished in 1956 and 1957 under the auspices of the Wharton School and the Bureau 
of Labor Statistics will welcome the appearance of the present work. It is written by 
a member of the staff of the Bureau and makes available a much more thorough and 
extensive discussion of the derivation of those data than is given in the prefatory 
portions of the volumes of tabulations. However, although it will serve as a handy 
companion to the tabular data and has been issued as a monograph in a Wharton 
School series of research studies based on them, it stands by itself as the most com- 
plete account put into print of the design and procedures of the 1950 Survey itself. 
A candid and conscientious statement of the reasons for decisions on methods taken 
in the course of devising our most comprehensive official consumer survey, it provides 
a revealing insight into the state of the survey arts and in particular illustrates how 
the unequal advancement of different portions of survey theory may affect practice. 

As things now stand, the construction of the stages of a survey that precede the 
confrontation of interviewer and respondent is relatively clearly charted in the light 
of well-developed theories of sampling, but processes of data collection have been 
explored relatively little in a systematic way. The disparity leads all too many re- 
searchers to occupy themselves with the elegancies of sample design and neat formu- 
lations of sampling error while neglecting faulty response and the less tractable un- 
certainties to which it gives rise. 

This cannot be said of the BLS which, on the record presented in the volume under 
review, experiments with some care before preparing materials and staff. Neverthe- 
less, one is struck by the absence of any kind of broad theory governing the selection 
of schedules for different purposes, the framing of questions, the instruction of in- 
terviewers, and other phases of putting together and conducting the Survey. Deci- 
sions were taken on the merits of empirical evidence pertinent to each matter sep- 
arately without benefit of any attempt to apply relevant theoretical generalizations 
which might have been available from social psychology, linguistic psychology, 
sociology, communications theory, or whatever. There was not the help even of a 
systematic set of concepts covering the structures and processes of survey work. 

In recent years sociologists have been integrating ideas from psychology into a 
theoretical approach to interviewing on attitudes and opinions. Most of their work 
is probably not directly applicable to the problems involved in getting information 
on income and expenditures from mass surveys. Nonetheless it is to be hoped that it 
will be taken as an example and that survey economists will soon undertake the task 
of contriving the conceptual guidelines whose lack becomes so evident as one reads 
the present monograph. 

Only about half of Mrs. Lamale’s text is given over to the description of design and 
procedures. While this half will be of interest primarily to survey statisticians and 
technicians, analysts will be more attracted by the remainder of the book, which is 
devoted to an evaluation of results and to comparisons with data from other sources. 
All serious students of the Survey will have occasion to refer to the appendices, which 
reproduce a set of Survey forms as well as other background material. 


The Fertilizer Industry: Study of an Imperfect Market. Jesse W. Markham. Nashville, 
Tennessee: Vanderbilt University Press, 1958. Pp. 249. $6.00. 


Earl O. Heady, Iowa State University 


Professor Markham has made an interesting and penetrating analysis of the 
structure of the fertilizer industry. The book is concerned especially with tracing 
the history of the bargaining position of the fertilizer industry, with effects on prices 
of fertilizer inputs to farmers and efficiency in resource allocation. 

This industry study includes five parts: I, the central policy problem and the prob- 
lem of measuring and allocating social costs of market imperfection; II, the organiza- 
tion and structure of the integrated fertilizer industries; III, the organization, struc- 
ture and marketing practices of complementary input industries; IV, pricing, produc- 
tion and distribution practices in the integrated fertilizer industries; V, market im- 
perfections and their measurement and policy implications. 

Markham has made a surprisingly deep analysis of the fertilizer industry, including 
the technology involved and the relation of technical processes to pricing and dis- 
tribution policies of the industry. He also has made a careful analysis of the relation- 
ship of pricing and distribution of fertilizer to the farming industry. He first traces the 
competitive or atomistic structure of the farm industry. Then he indicates how knowl- 
edge and other restrictions restrain the atomistic structure of the farm industry from 
the most efficient allocation of resources in the industry. He then reviews the public 
policy and legislative measures over the last century directed towards maintaining 
the competitive structure of agriculture and improving resource allocation. 

Next he outlines the resource base (phosphate deposits, etc.) on which the fertilizer 
industry is based, with a review of the history of numbers of firms and size of the 
industry. He reviews the bargaining organizations used historically in the fertilizer 
industry to establish fertilizer prices and share the market. The role of American firms 
in international cartels is discussed. The results of oligopolistic actions are outlined 
in detail, especially up through the time the Department of Justice began investigat- 
ing the industry. He provides detailed explanation of the manner in which member- 
ship in export and similar associations entered into and honored exchange agreements, 
both in domestic and foreign business. He also explains how control of devices such as 
shipping and storage facilities and patents for processing methods were used to attain 
particular ends. Examinations of pricing, distribution and market sharing policies 
are made for the phosphate-rock industry, the superphosphate and mixed-fertilizer 
industries, and the nitrogen industry. He examines, at several points, the question of 
“why there is so much sand in the farmer’s fertilizer,” with particular reference to the 
analyses of fertilizers sold to farmers. In this respect, he is concerned with the quan- 
tity of nutrients, a main criterion of the quality of the resource, in fertilizer sold to 
farmers. 

His last section on market imperfections and their measurement and policy impli- 
cations is the most analytical part of the book. This section has general implications 
beyond the particular analysis of the fertilizer industry, although the discussion is 
still couched in terms of the particular industrial sector. His conclusion is that vigor- 
ous anti-trust law administration has held down the social costs of monopoly in the 
industry but has not made all sectors of the industry workably competitive. He 
states that in some sectors a few firms account for most of the output and that monop- 
oly rents remain high. He suggests that this is true because oligopoly is tolerated by 
the Sherman Act and the role of anti-trust activity in respect to the fertilizer industry 
principally serves to prevent any further lessening of competition. 

His final chapters deal with methods of improving resource allocation through les- 
sening imperfect knowledge and positive government policy. Under the latter, he 
analyzes the role of public research and education in improving technical knowledge 
and technical processes in fertilizer production and farmer knowledge in the nature 
and use of fertilizers. The public research and education programs discussed include 
those of the Bureau of Plant Industry, the Agricultural Conservation program, the 
Tennessee Valley Authority and Land Grant Colleges. He also indicates a growth in 
competition of some sectors of the industry, resulting from these public efforts and 
the disposal of nitrogen plants after the war. 

While Professor Markham’s book deals with the empirical detail of a particular 
industry, the analysis has important implications for analyses of market structure and 
organization in other sectors of the economy. He is to be commended for the detail 
which he devoted to an analysis of the technical aspects and nature of a particular 
industry. This detail provides a basis for insights into critical elements bearing on 
price and distribution policies. 


The Stages of Economic Growth: A Non-Communist Manifesto. W. W. Rostow. New 
York: Cambridge University Press, 1960. Pp. xi, 179. $1.45. 


Anne Krueger, University of Minnesota 


In this short book, Professor Rostow undertakes to present 1) “a theory of eco- 
nomic growth,” 2) “a theory about modern history as a whole,” 3) an explanation 
of how the growth process begins, 4) an account of the relationship of economic 
growth to nationalism, and vice versa, and 5) “an alternative to Karl Marx’ theory 
of modern history.” All this is possible with the now familiar “stages-of-growth” 
analysis. 

In statistical terms, this book has little to offer. The author’s competence in eco- 
nomic history enables him to select from a broad range of historical data those events 
that illustrate most clearly the point he is establishing. Apart, however, from a few 
tables all drawn from other sources (primarily Kuznets, Nutter, Firestone and gov- 
ernment agencies), the book is entirely descriptive, due in part to the paucity of data, 
and partly to Rostow’s approach to his subject. Moreover, none of the “theory” out- 
lined is formulated in a manner amenable to statistical testing. 

It is doubtful whether a theory of growth can be meaningful that is not sufficiently 
general to take into account equilibrium stagnation as one special case of a more gen- 
eral growth model, or, at a minimum, does not account for the economic changes 
concomitant on departure from the stagnation stage. Yet, in Rostow’s treatment, the 
stagnation stage is not integral to what follows: he dismisses the traditional society 
in one page, since he is “merely clearing the way in order to get at the subject” (p. 6) 
at hand. 

The transition stage, in which the preconditions for growth (a sharp increase in 
agricultural productivity, rising social overhead outlays, emergence of leadership of 
the kind wanting economic development—Ch. 3) are established, starts, “not endog- 
enously, but from some external intrusion of more advanced societies” (p. 6). The 
first precondition in England was “a kind of statistical accident of history which, 
once having occurred, was irreversible .. .” (p. 31). Although one might agree with 
the specific “preconditions” necessary for further growth, there is nothing in the 
Rostow formulation that explains why these specific events might (or might not) oc- 
cur, save of course that nothing later can happen without their prior occurrence. 
Moreover, it is unclear why the preconditions must occur prior to, rather than con- 
currently with, later developments. 

Once the transition is completed, the country is ready for the “take-off.” Some 
sharp stimulus, particularly to a new industry, generally initiates this phase—a 
twenty-year period during which investment rises from less than five to more than 
ten per cent of output, a substantial manufacturing sector develops, and a political 
and social framework emerges which “exploits the impulses to expansion in the mod- 
ern sector...” (p. 39). The distinguishing feature of the “take-off” seems to be that 
it is the period during which the rate of growth of output is increasing, concurrently 
with the increase in the rate of investment. Apart from the obvious fact that there 
must be some period during which the rate of growth of output increases as a stagnant 
society starts increasing its output, it is unclear exactly what role, if any, the “take- 
off” plays. Historically, there are cases in which it has taken more than twenty years 
to get more than ten per cent of GNP invested. Presumably, if a country gets through 
the “take-off,” growth will in some sense be self-sustaining. Yet Cuba (1920-40), 
India (1850-80), and other countries have experienced periods exhibiting all the
characteristics of Rostow’s take-off, and have later undergone sharp drops in the 
growth rate. 

The “drive to maturity” follows automatically after the “take-off” is completed. 
Indeed, the significance of the “take-off” seems to be that it is a necessary (and suffi- 
cient) condition for the later phase. This is the stage during which the “magic of com- 
pound interest” gets built into society’s structure. Growth becomes an “essentially 
biological” phenomenon (p. 36) and it is the changing production functions (the 
“dynamic theory of production”) in one field after another that are responsible for 
self-sustained growth. Despite the author’s focus on the “powerful arithmetic of com- 
pound interest” (pp. 7, 10, 36, 46, 60, 71, 90, 92, 123, 127, 148 and 154) it is question- 
able whether growth is necessarily self-sustaining. Moreover, there is nothing in the 
account of this stage, or for that matter any other, that in any sense explains the de- 
terminants of the rate of growth as between different countries. 

Despite one’s predilections about the nature of the growth process, one cannot help
but feel that it is not quite so mechanical and automatic as Rostow would have us
believe. The “exogenous” nature of the early phases of growth, coupled with the im- 
plied inevitability of the later growth stages, and the lack of any systematic treat- 
ment of the underlying economic relationships render the proffered “theory” unsatis- 
factory. 
The most sobering aspect of the book is that economists thus far have been unable
to do much better. To the extent that Rostow’s stages and methodology produce a 
sufficient irritation, a more promising and fruitful approach to the determinants of 
growth may be forthcoming. 


A Source Book in Mathematics. David Eugene Smith. New York: Dover Publications Inc., 
1959. 2 Volumes. Pp. 701. $1.85 per volume. Paper. 


Robert H. Oehmke, Michigan State University


These volumes are a republication of the First Edition originally published in 1929
under the auspices of the American Philosophical Association and supported by a 
grant from the Carnegie Corporation of New York. It was the intention of the pro- 
gram that led to this work “to present the most significant passages from the works 
of the most important contributors” to mathematics during the last three or four
centuries. The natural choice for this task, at the time, was one of the outstanding men
in the activity of the history of mathematics, David E. Smith.

1 That the concept of “take-off” may be quite deceptive, particularly in policy makers’ hands, may be seen
by the use made of the concept by Indian planners, who have argued that twenty years of rapid investment will
be sufficient to generate self-sustained growth in an attempt to justify overly-optimistic investment programs.
(See the Economist, March 26, 1960, for a summary of this attempt and its explanation in terms of the Rostow
stages.)

The material, consisting of some ninety excerpts from the works of sixty-three 
classical contributors to the development of mathematics, is arranged under five 
general headings: Number; Algebra; Geometry; Probability; Calculus, Functions and 
Quaternions. The articles are about equally divided among these categories. The se- 
lection of mathematicians is from the fifteenth through the nineteenth centuries with 
a natural predominance by the nineteenth century mathematicians. Each article is 
preceded by a historical note giving pertinent facts about the author’s life and com- 
menting on the important contribution to mathematics contained in the article. The 
footnotes are frequently used and elucidate the topic still further as well as providing 
references for more detailed study. 

A large percentage of the material covered should be of interest even to a high 
school student. Articles on trigonometry, irrational numbers, the slide rule, etc.,
coincide with material taught in most of our larger high schools. Certain of the ar- 
ticles are entertaining enough to be enjoyed by the general reader. A few of these, 
particularly enjoyed by the reviewer, are: Bernoulli on the Brachistochrone Problem, 
the correspondence of Fermat and Pascal on Probability, several pages from an 
arithmetic by Robert Recorde, and passages from Berkeley’s Analyst. One of the 
criteria used by Smith in his selections was a presentation of material “succinct 
enough for purposes of quotation”; however, in comparison with the style of writing 
in today’s journals, the style of the selected articles is expansive enough to make 
most of the articles “light” reading. 

Of course, in a work of this sort, the selection of topics and mathematicians is
determined, somewhat, by the author’s taste. Particularly noteworthy, considering
the recent developments in mathematics, is the omission of articles on axiomatics and 
set theory. Also, even though the table of contents presents one with an impressive 
list of the important contributors to mathematics, the list of missing names is equally 
impressive. The nature of the work itself precluded the inclusion of contributions 
made before the advent of printing and material on the more recent developments in 
mathematics. 

In spite of these omissions, the work still is a most valuable aid to the student of 
mathematics or the history of science and such a readable collection of articles de- 
serves a place in every library. 


Elementary Statistical Methods in Psychology and Education. Paul Blommers and E. F. 
Lindquist. Boston: Houghton-Mifflin Co., 1960. Pp. xv, 528. $5.75. Study Manual $2.00.


Terrence M. Allen, Michigan State University


Among recent elementary statistics textbooks there seems to be a trend toward
“statistics made easy” or “statistics made interesting.” This book is an exception
to this trend. The avowed objectives are (1) “to make a relatively few basic statistical 
concepts and techniques genuinely meaningful to the student, through a reasonably 
rigorous developmental treatment,” and (2) to present a treatment “that may be 
readily understood by the student and which will hold his interest.” The first objec- 
tive seems to have been achieved very well, the second not so well. 

Those of us who have exerted great effort in inducing students to unlearn the over- 
simplified ideas they learned from “statistics made easy” textbooks can appreciate 
this book. One topic only is taken up per chapter, and that topic developed thor- 
oughly. For example, a complete chapter on standard scores before introduction of 
the normal distribution helps avoid common student misconceptions on the relation 
between these two concepts. Such an approach enables the authors to achieve very 
good overall organization of the material presented. To cover the basic descriptive
statistics and tests and confidence intervals concerning means, medians, and propor- 
tions, 357 pages are used. The remaining 110 pages cover correlation, regression, and 
the corresponding tests and confidence intervals. 

The development of descriptive statistics is unusual only in the length of verbal 
discussion of the fine points and uses of each. For reasons of “personal pedagogical 
preference,” in a “perhaps stubborn resistance to trend in statistical books,” the 
variance of a sample is defined with N instead of N-1 in the denominator. To avoid 
conflict with standard modern usage, 𝔰² (the final German ess) is used as the corre-
sponding symbol, and a second symbol is later introduced for the unbiased estimate.
Failure to then abandon 𝔰 leads to clumsiness and confusion.
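
The difference between the two divisors can be illustrated with a short sketch. Python is used here purely for illustration, and the five scores are invented; nothing below is taken from the text under review.

```python
from statistics import pvariance, variance

# Hypothetical sample of N = 5 scores (invented for illustration).
scores = [2.0, 4.0, 4.0, 5.0, 10.0]

# Divisor N: the definition of the sample variance adopted in the reviewed text.
var_n = pvariance(scores)

# Divisor N - 1: the unbiased estimate of the population variance.
var_n_minus_1 = variance(scores)

n = len(scores)
# The two quantities differ only by the factor N / (N - 1).
assert abs(var_n_minus_1 - var_n * n / (n - 1)) < 1e-12

print(var_n, var_n_minus_1)
```

With small samples the factor N/(N-1) is far from one, which is presumably why carrying both quantities side by side through later chapters becomes awkward.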

Instead of the usual treatment proceeding from basic probability to the binomial 
distribution and the normal distribution as its limiting form, the normal curve is 
introduced before mention of the word probability. The binomial distribution and 
multiplication law for probabilities are not mentioned, and the addition law is 
loosely stated only as needed for adding areas under the normal curve. Non-para- 
metric methods are not mentioned, and proportions are treated only as continuous 
variables in the large-sample case. This approach can be defended, however, on the 
ground that these topics should be deleted until they can be taught properly in a 
more advanced course. 

It is the chapters in the middle of the book on sampling, hypothesis testing, and 
interval estimation that are worthy of notice. We have here a sound elementary 
treatment of basic sampling theory and sampling distributions. Hypothesis testing 
is then excellently introduced as a form of indirect proof, and the formal system of 
statistical inference is explained with much greater accuracy than we have learned to 
expect from elementary textbooks. Types of error, power of a test, and interval esti- 
mation are covered similarly. The chapter on the t-statistic, its sampling distribution 
and its uses makes clear that the t-test takes account of the sampling error involved 
in the variance estimate (although 𝔰 still clouds the picture). In this reviewer’s opin-
ion, this is the most accurate treatment of these topics of any text intended for an
introductory course for students in psychology and education.

The final three chapters cover correlation, regression, and the corresponding sig- 
nificance tests and interval estimates. The explanation of correlation and regression is 
about as expected (although we are still plagued with 𝔰), but the final chapter is again
more accurate than usual. The book has less than its share of inaccuracies, and its 
idiosyncrasies (such as the prominence given to classical large-sample tests of me- 
dians, and an overemphasis on educational and psychological tests) do not seem to 
be serious shortcomings. 

The book is disappointing with respect to its second objective because it is not 
readily understandable to the student and does not hold his interest. Although the 
level of mathematical sophistication is hardly above that of the usual textbook, the 
level of verbal sophistication is quite out of the usual range. The opportunity to use 
this text for part of a term has made it clear to me that students find the exposition
difficult to understand, and have even more difficulty maintaining their motivation for 
the study of it. They become so bogged down in details of the involved explanations 
that they miss the main points of the chapter. Many examples employ subject matter 
about which students know little and care less. Although the “folksy touch” of some 
recent texts can be cloying, this book could be greatly improved by some human inter-
est and plain talk.

Although this book is a new presentation rather than a revision of Lindquist’s A 
First Course in Statistics (Houghton-Mifflin 1938 and 1942), the study manual has 
borrowed heavily from its predecessor. The result is a workbook well-integrated with 
the text, and, for a change, one which can be really called a supplement to the text.

Given highly-motivated students with a high level of verbal sophistication, and an 
instructor able to keep the overall picture before them, this book should give a sound 
basic introduction to the statistics used in education and psychology. In addition, the 
students will learn relatively little that is not true.


Statistical Analysis. Edward C. Bryant. New York: McGraw-Hill, 1960. Pp. 303. $6.50. 
Frank Meissner, San Jose State College


This is a basic text dealing with application of statistical methods to managerial
decision making. Bryant competently covers the standard topics, and illustrates
them with many amusing examples. The following “bonus” topics have not previously 
been covered in a manner that would be suitable for undergraduate instruction: 
workable computing schemes for multiple regression and correlation, Markov 
theorem on least squares, analysis of variance theory, and queuing and inventory 
problems. 

The strength of the book is in its healthy differentiation between mathematics and 
statistics on the one hand, description and inference on the other. Bryant shows 
clearly how statistical theory depends upon mathematics. The more mathematics 
one knows, the better a statistician one can become. But this does not mean that the 
two disciplines are identical. Mathematics is an exact and abstract science leading to
logical statements, equations, or values; while statistics addresses itself to reality by
attempting to evaluate scientific endeavor—its activities result in probabilistic state- 
ments, equations, or values. The nature of the universe is not mathematical but sta- 
tistical. Since Americans are empirically minded, it should be far easier to put across 
statistics than mathematics. 

Bryant reveals his bias in the matter of the false ambivalence between descriptive 
(tabulating, graphing, and summarizing data by means of measures of central tend- 
ency and variability) and inferential (procedures and techniques used in testing 
hypotheses and making inferences from data) statistics. Descriptive statistics are an 
indispensable tool for performing the acts of inference. However, we gain very little in 
life through mere quantification of attributes of our environment. Yet, we profit a 
great deal if we can make reliable and valid inferences from quantitative data. This 
distinction alone should be a helpful piece of insight that a student can derive from 
this excellent text. 


American Marriage and Divorce. Paul H. Jacobson. New York: Rinehart and Company, 
Inc., 1959. Pp. xviii, 188. $12.00. 


David M. Heer, Bureau of the Census


Dr. Jacobson’s book is designed to fill a sizable gap in the published data con-
cerning marriage and divorce in the United States. This gap resulted from the
system by which the U. S. Government has collected statistics on marriage and di-
vorce, which has not provided statistics for the Nation as a whole. Instead, detailed
statistics have been published regularly at most only for those States within marriage 
and divorce registration areas. For example, data on the number of first marriages 
are available annually for only 24 States and New York State, excluding New York 
City. 

Dr. Jacobson’s book is a valiant and prodigious attempt to derive nationwide 
statistics on marriage and divorce from the incomplete data published by the Na- 
tional Office of Vital Statistics, data collected by the Bureau of the Census in the 
Decennial Censuses and the Current Population Survey, and supplementary data 
obtained by the author from State registrars and many other sources. In the main,
this endeavor has produced worthy results. The more methodologically oriented 
reader will have wished that Jacobson had given more detail on how he arrived at his 
estimates, but it may be that the publisher did not care to print in fine detail the in- 
tricate estimating procedures which Jacobson probably employed. Jacobson would 
have rendered a service, under the circumstances, if he had more frequently warned 
the reader on how to distinguish the more reliable estimates from the less reliable. 

In the remainder of this review, attention will be devoted mainly to those data 
and measures which, in the reviewer’s opinion, are least reliable. The reviewer thinks 
this course will maximize his contribution to the reading public. Such emphasis should 
not be interpreted to mean that the reviewer has doubts concerning the general qual- 
ity of Jacobson’s work. For the most part, the data presented are of high quality 
and will prove invaluable to many researchers and practitioners working in the area 
of marriage, divorce, and family relations. 

The introductory chapter gives the history of the collection of statistics on marital 
status by the U. S. Bureau of the Census and describes the efforts of the Bureau of
the Census, and later of the National Office of Vital Statistics, to collect data on mar- 
riage and divorce from the States. Also in this chapter Jacobson discusses the possible 
underreporting of divorced (but not remarried) persons in the Decennial Censuses 
and attempts to estimate, by a components-of-change method, the “true” number of 
persons in each marital status category at the time of each census from 1900 to 1950. 
Utilizing this method, Jacobson finds the cumulative shortage in the reported number 
of divorced persons by 1950 to be three times as large as the reported number. The 
true number, however, produces an impossibly low death rate for divorced persons. 
Therefore, he employs a second method of estimating the true number of divorced 
persons. Using this second method, he concludes that the true number of divorced 
persons in 1950 was only 26 per cent greater than the number reported by the Bureau 
of the Census. This second method, however, involves a circular computation of the 
annual number of deaths and remarriages of divorced persons using estimated death 
and remarriage rates of divorced persons in which the denominator of the rate is the 
number of persons reported divorced by the Bureau of the Census. If the censuses 
have underestimated the true number of divorced persons, then the death and re- 
marriage rates of divorced persons are inflated and subsequent estimates of the num- 
ber of persons divorced but not remarried are too low. In view of the inaccuracies in 
the estimates of remarriages or deaths of divorced persons, it may be best to conclude 
that only a broad range of probable estimates of the true number of divorced persons 
can be made from the presently available data. 
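
The direction of the bias described above can be made concrete with a small arithmetic sketch. All figures are invented for illustration and do not come from Jacobson’s book; the function name is likewise hypothetical.

```python
def exit_rate(exits, base):
    """Deaths plus remarriages per divorced person in one year."""
    return exits / base

# Hypothetical figures, invented for illustration only.
true_divorced = 1_000_000      # assumed true number of divorced persons
reported_divorced = 800_000    # assumed census count (an undercount)
exits = 100_000                # assumed deaths plus remarriages in the year

true_rate = exit_rate(exits, true_divorced)          # rate on the true base
inflated_rate = exit_rate(exits, reported_divorced)  # rate on the undercounted base

# Carry the same stock of divorced persons forward one year with each rate.
stock = 1_000_000
remaining_true = stock * (1 - true_rate)
remaining_estimated = stock * (1 - inflated_rate)

print(true_rate, inflated_rate)
print(remaining_true, remaining_estimated)
```

Because the undercounted base inflates the rate, too many persons are removed from the divorced stock each year, and the estimate of those still divorced falls short of the true figure.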

Chapter 1, entitled “The Trend of Marriage,” provides an historical series on mar-
riages and marriage rates from 1860 to 1956. The numbers of marriages for 1860 to
1866 have been estimated by the author from fairly complete records of marriage in
the Northern States but extremely sketchy records from the Southern States. Mar- 
riage rates per one thousand marriage eligibles (single, widowed, or divorced persons) 
are also computed from 1860 to 1956. The accuracy of the rates for the years from 
1860 to 1890 is subject to whatever weaknesses exist in the estimated number of
marriages prior to 1867 and in the estimates of the population by marital status for 
years prior to 1890, the first date for which the U. S. Census published data by marital
status. This chapter contains a table on monthly marriage rates from 1917 to 1955 
which should prove very useful to researchers interested in the effect of external 
events on marriage. From these data Jacobson provides an excellent discussion of the 
effects of the business cycle and war on the marriage rate. 

Chapters 2, 3, and 4 present interesting discussions of the following topics: the
seasonal pattern of marriage, variations among the States in marriage rates and in
marriage law, and factors influencing the proportion of marriages performed by re- 
ligious rather than civil ceremony. Chapter 5 contains an informative discussion of 
differentials in the pattern of marriage with regard to race, age, and previous marital
status. One feature of this chapter is a table (Table 33) showing marriage rates by 
age, sex, and previous marital status. According to this table, at each age group 
divorced men and women have the highest marriage rate. Since the denominator of 
the rate is the unadjusted number of persons reported as divorced, the amount of the 
differences in marriage rates by marital status is probably exaggerated. However, 
at young ages some mothers of illegitimate children in the numerator who claim to be 
remarrying after having received a divorce may actually have never married. A true 
remarriage probability for divorced persons would be computed with a base compris- 
ing the total number of divorced persons exposed to the risk of remarriage during 
the year—including not only the midyear divorced population (the conventional 
base for such a rate and the base used by Jacobson) but also divorced persons who 
remarried in the first half of the year plus those who become divorced in the remaining 
half of the year. Ideally, both the numerator and denominator of such a measure 
should be adjusted for errors of reporting. 
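
The contrast between the conventional midyear base and the fuller exposure base just described can be sketched as follows. The function names and all counts are hypothetical, and no adjustment for errors of reporting is attempted.

```python
def conventional_rate(remarriages, midyear_divorced):
    """Remarriages divided by the midyear divorced population."""
    return remarriages / midyear_divorced

def remarriage_probability(remarriages, midyear_divorced,
                           remarried_first_half, divorced_second_half):
    """Remarriages divided by all divorced persons exposed during the year."""
    exposed = midyear_divorced + remarried_first_half + divorced_second_half
    return remarriages / exposed

# Hypothetical counts, invented for illustration only.
rate = conventional_rate(remarriages=200, midyear_divorced=1_000)
prob = remarriage_probability(remarriages=200, midyear_divorced=1_000,
                              remarried_first_half=100,
                              divorced_second_half=150)
print(rate, prob)
```

The fuller base is never smaller than the midyear population, so the probability computed on it cannot exceed the conventional rate.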

Chapter 6 discusses the probability of first marriage and of remarriage. The prob- 
ability of first marriage for single males and females at each single year of age is pre- 
sented according to 1948 and 1940 age-specific marriage rates. The nuptiality tables 
based on the 1948 rates are the most recent ones available as a predictor, but they 
are subject to certain weaknesses, since 1948 was a year in which the age at first mar- 
riage was on the decline and, hence, a year in which age-specific marriage rates were 
higher than they would have been had the age at marriage been constant. According
to the nuptiality table based on 1948 age-specific marriage rates, the chances of 
eventual marriage reach the almost unbelievably high level of 98.1 per cent for fe- 
males at age 16 and 97.1 per cent for males at age 19. Jacobson also presents nuptial- 
ity tables for widowed and divorced men and women. For reasons cited above, the 
probabilities of eventual remarriage for divorced persons shown in these tables are 
very likely to be exaggerated, but they are the most current ones available. 
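
A minimal sketch of how a chance of eventual marriage follows from age-specific probabilities is given below. It assumes a single decrement (marriage only) and ignores mortality, which an actual nuptiality table need not do; the probabilities are invented, and the function name is hypothetical.

```python
def chance_of_eventual_marriage(first_marriage_probs):
    """Chance of ever marrying, given the probability of marrying at each
    successive age among those still single at the start of that age."""
    still_single = 1.0
    for q in first_marriage_probs:
        still_single *= 1.0 - q
    return 1.0 - still_single

# Hypothetical age-specific probabilities for three successive ages.
base = chance_of_eventual_marriage([0.1, 0.2, 0.5])

# Raising every age-specific probability by a tenth, as might happen in a
# year when the age at marriage is falling, raises the eventual chance too.
raised = chance_of_eventual_marriage([0.11, 0.22, 0.55])
print(base, raised)
```

This is the sense in which rates taken from a year of declining age at marriage can overstate the chance of eventual marriage.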

Chapters 7 through 10 concern divorce. Chapter 7 shows the historical trend of di- 
vorce in the United States and relates the divorce rate to wars and the state of the 
economy. Chapter 8 discusses geographic and demographic variations in the divorce 
rate and the phenomenon of migratory divorce. In this chapter Jacobson shows that 
for two areas of the country where data are available, Roman Catholics account for 
an appreciable proportion of divorces granted; furthermore, from data available for 
a small number of States, Jacobson deduces that, in 1950 and the years immediately 
preceding, Negro males probably had a higher divorce rate than white males. Chapter 
9 discusses existing divorce proceedings and their historic development, with atten- 
tion given to the legal grounds for divorce and the party to whom granted. Chapter 
10 discusses the interesting question of children in divorce. Making use of data ob- 
tained from the 1948 Current Population Survey, Jacobson estimates divorce rates 
according to duration of marriage depending on whether or not the couple have 
children. He concludes that until the marriage has lasted long enough for all of the 
couple’s children to have at least reached adolescence, those with no children are 
more likely to experience divorce than those with children. Jacobson does not posit a 
necessary causal relationship between childless marriages and divorce; he holds that 
both childlessness and divorce often stem from “more fundamental factors in the 
marital relationship.” 

The final chapter deals with the duration of marriage and the relative frequency 
with which marriages are dissolved by death or divorce. A duration of marriage table
based on 1948 duration-specific divorce rates shows the probability of a marriage 
eventually ending in divorce as 29.1 per cent. The same table based on 1955 duration- 
specific divorce rates shows the proportion ending in divorce as 24.9 per cent. Compu- 
tation of true probabilities that a marriage will end in divorce necessitates the use of a 
cohort method rather than the cross-sectional method employed by Jacobson. How- 
ever, Jacobson’s computations provide a useful approximation to what might be 
found by the cohort approach.
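
The cross-sectional calculation can be sketched as a table with two ways of leaving the married state. The sketch assumes that one calendar year’s duration-specific rates are applied to successive durations of a hypothetical group of marriages; a cohort calculation would instead use the rates actually experienced by one marriage cohort as it aged. The rates and the function name are invented for illustration.

```python
def share_ending_in_divorce(divorce_rates, death_rates):
    """Proportion of marriages ending in divorce when the given rates of
    divorce and of dissolution by death apply at each successive duration."""
    intact = 1.0
    ended_in_divorce = 0.0
    for d, m in zip(divorce_rates, death_rates):
        ended_in_divorce += intact * d   # divorces among marriages still intact
        intact *= 1.0 - d - m            # marriages surviving both risks
    return ended_in_divorce

# Hypothetical rates for two successive durations of marriage.
share = share_ending_in_divorce([0.1, 0.1], [0.0, 0.1])
print(share)
```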

Steps are being taken by the National Office of Vital Statistics to expand the 
marriage and divorce registration areas until they cover all of the States. The urgency 
of this movement for the improvement of marriage and divorce statistics is recognized
by all students of this field. As a promising interim measure, sample surveys are 
being initiated by NOVS to supply figures on these subjects for States not now in the 
registration areas. Meantime, Dr. Jacobson has rendered a distinct service in his con- 
scientious attempt to provide historical depth for the marriage and divorce figures 
that are now being collected and in applying the tools of the actuary to the statistics 
he has tediously assembled over many years. 


Nomography. (Second Edition) A. S. Levens. New York, New York: John Wiley and Sons, 
1959. Pp. vii, 296. $8.50. 


H. O. Hartley, Iowa State University


Nomographs are graphical representations of functional relationships of the form

$$y = f(z_1, \cdots, z_n) \qquad (1)$$

in which the user enters the scales corresponding to the “given” or “input” readings
$z_1, \cdots, z_n$ and after some graphical manipulation obtains the “solution” or
“response” reading y on one of the scales of the device. Compared with a digital tabu-
lation of f the method is often preferred when only two or three significant figures are
required, particularly by those (such as the engineers and physical scientists) who 
are well accustomed to graphical methods and scale readings. In statistical method- 
ology the use of nomographs lags behind that of tables, possibly because the poten- 
tialities of these devices are often not fully appreciated. The power of the method for 
representing a multivariate relationship such as (1) is that under special circum- 
stances nomographical representation is vastly more economical than the printing 
of a multivariate table. As an example of such a special situation consider a relation
between y and $z_1$, $z_2$ given by


+ foly) = foly) (2) 


Chapter 5 is devoted to this particular situation and provides so-called Z-charts (a
special case of alignment charts) for such relationships which are of amazing sim-
plicity when compared with a double entry table of y as a function of $z_1$ and $z_2$.
Chapters 3 to 10 are in fact devoted each to a similar “special case” of functional re- 
lationship. This arrangement (by the form of the functional relationship) is most use- 
ful to the user in each particular branch of science since this will be his starting point 
and he is thereby conveniently guided to the devices appropriate to his problem. 
It is true that usually the hardest part of the task is to put the given scientific rela- 
tionship into a form in which nomographical treatment is possible. The examples 
given at the end of each chapter (mainly from the engineering sciences) are of com- 
paratively simple nature. With statistical relationships such a special functional form 
can sometimes be obtained only if certain approximations in the original general 
form (1) are made. When the special form has been selected there still remain the 
details of convenient graphical representation such as the choice of device, scale and
unit, and here this book is full of sound advice. Indeed the last six chapters (chapters
11 to 16) deal with the graphical form of the device and other aids to graphical repre- 
sentation. Of particular interest is the application of what is called the “Duality 
Principle” to the transformation of “concurrency charts” to alignment charts (chapter 
16) illustrating this method by an example which most clearly demonstrates the gain 
in precision and clarity of presentation. 
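
How an alignment chart replaces a double entry table can be illustrated with a small numerical sketch of one common Z-chart layout. The sketch assumes the multiplicative relation u = v * w with positive v and w, which is chosen only for illustration and is not claimed to be the form treated in the book or in the relation (2) above. The layout (two parallel scales read in opposite directions, joined by a third scale running between their zeros) and all function names are assumptions of the sketch.

```python
def diagonal_position(w):
    """Position along the joining scale at which the value w is marked."""
    return w / (1.0 + w)

def chart_points(v, w):
    """Points through which a straightedge passes when u = v * w."""
    u = v * w
    left = (0.0, u)                      # u scale on the line x = 0, read upward
    diag = (diagonal_position(w), 0.0)   # w scale joining the two zeros
    right = (1.0, -v)                    # v scale on the line x = 1, read downward
    return left, diag, right

def is_collinear(p, q, r, tol=1e-9):
    """True when the three points lie on one straight line."""
    cross = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return abs(cross) <= tol

def read_u(v, w):
    """Lay a straightedge through the v and w marks; read where it meets x = 0."""
    x_d = diagonal_position(w)
    slope = v / (x_d - 1.0)              # line from (1, -v) to (x_d, 0)
    return -v - slope

left, diag, right = chart_points(3.0, 4.0)
print(is_collinear(left, diag, right), read_u(3.0, 4.0))
```

Three graduated scales thus serve every pair of inputs, whereas a double entry table needs one printed entry for each combination.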

Although this book has originated from a treatise for scientists mainly in the phys- 
ical and engineering sciences, statisticians who are paying increasing attention to the 
use of these tools would do well to consult this book, even if they will not find much 
in it that will help them with the problem of constructing nomographs directly from 
empirical data. 

The Literature of the Social Sciences. Peter R. Lewis. London, England: The Library 
Association, 1960. Pp. xx, 222. $4.20. 


William G. Madow, Stanford Research Institute


From the Preface, “This book deals generally with literature and resources avail-
able for the study of the social sciences and their history from about 1800 onwards
and concentrates particularly on the twentieth century. Theoretical aspects have 
been given due weight, but I have tried to demonstrate more emphatically the prac- 
tical side of the social sciences and to consider especially the interests of British read- 
ers, in the belief that this will increase the value of the book to the administrator, the 
businessman, the research worker, and the general reader. For this reason, problems 
of library administration have been included as factors of interest to those who use 
libraries as much as do those who run them. The closing ‘date’ for the consideration 
of new publications was set at 31 December 1958, but details of new editions, etc., of 
works already included have been incorporated as they became known up to the end 
of February 1959.” 

“A work of this size cannot hope to become comprehensive and is bound to be 
somewhat superficial: the literature has grown to such dimensions that it would re-
quire a number of volumes to account in detail for the many facets, subjects and re-
sources that constitute the enormous field of knowledge known as the social sciences.
I do not claim even to have touched on all these aspects in passing, although I have 
attempted to account for at least the most prominent and influential of them. One of 
the purposes for which the book was written was to draw the attention of others with 
more specialized qualifications than mine to the areas in which further detailed stud- 
ies are most needed.” 

This reviewer has approached this volume from the point of view of a person trying 
to look up material in areas with which he is unfamiliar. The general impression is 
that although this book may serve as a starting place, it will not provide complete or 
even almost complete references to the subject matter as it exists in the present day. 
On the other hand, it is a comforting book, stating some of the problems that will be 
met within libraries and also commenting briefly on many of the books it discusses. 
In other words, to use this book to build up a complete bibliography or even to pick 
out the most salient books in any area would probably be unwise. To use it as a start- 
ing point or as a book to which one occasionally refers in connection with a more de- 
tailed bibliographic search would probably be very helpful. 

Finally, although law is included as a major category, psychology is omitted. For 
this and other reasons, the content of the book is not really the social sciences as un- 
derstood in the United States. 


Indian Currency. J. V. Kapoor. Allahabad, India: Kitab Mahal, 1959. Pp. vi, 229. Rs.
6.90 (Foreign, $1.50).


Andrew F. Brimmer, Michigan State University


Despite the title of this book, the author ranges widely over a number of subjects
besides currency, including monetary theory, central banking policy, partition 
of India and rehabilitation of refugees, international trade, and economic planning. 
The book has been written primarily for undergraduates preparing to take examina- 
tions for the bachelor’s degree in commerce or for certification by the Indian Institute 
of Bankers and Chartered Accountants. Thus, it is likely to be of only slight interest 
to mature students of monetary economics or to specialists in Indian affairs. 

The subject matter is treated historically in two main parts, the partition of India 
in 1947 being the dividing line. The bulk of Part II consists of a reprint of a pamphlet 
by the author dealing with India’s Five Year Plans. (The author believes that India 
made a fundamental error toward the end of the First Plan and maintained it through 
the Second Plan when emphasis was shifted from agriculture to industrialization.) 
The author’s treatment of economic planning in India does not contribute anything 
new to the general understanding of the planning process and is barely adequate as 
an exposition of the main features of the Plans. 

Part I of the book shows clearly that the history of Indian currency before World 
War I was primarily the history of efforts to maintain a stable sterling-rupee exchange 
rate (1s. 4d:Rs.1) after the United Kingdom and most of India’s other trading part- 
ners adopted the gold standard while silver remained the basis of India’s currency. 
Although a few officials (including Lord Keynes, who visited India in 1913 as a member 
of the Chamberlain Commission) clearly appreciated India’s predicament in having 
its economy tied to silver, which was experiencing a long price downtrend, the issue 
was not resolved until the Reserve Bank of India was established in 1935. 

However, with the passing of the exchange rate issue, the distortions imposed on 
the financial system by World War II led to new problems of inflation, exchange con- 
trols, and the accumulation of blocked sterling balances. After the end of the war, 
and before the legacy of war-time controls could be liquidated, the partition introduced 
other strains, subsequently aggravated by the depreciation of the rupee along 
with sterling in 1949. In the last decade, monetary conditions in India have been 
dominated primarily by the execution of the Five Year Plans, leading to considerable 
suppressed (and more recently to open) inflation. Consequently, monetary controls 
have been secondary, and major reliance has been on a variety of direct measures to 
restrict the availability of credit. However, in April, 1957, India undertook a purely 
monetary reform by adopting the decimal currency system to replace the former 
fractional coinage, in which the rupee was divided into 16 annas, 64 pice, or 192 pies. 
Although there was considerable opposition to the move on the part of small merchants, 
the new system seems to have caught on rather quickly. 
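
The arithmetic of the changeover can be sketched in a few lines. The example below uses the subdivisions quoted above (1 rupee = 16 annas = 64 pice = 192 pies); the figure of 100 new units (naye paise) to the rupee is an assumption supplied here, not stated in the review, and the function names are purely illustrative.

```python
from fractions import Fraction

# Pre-decimal subdivisions of the rupee, as quoted in the review.
PIES_PER_RUPEE = 192
PIES_PER_ANNA = PIES_PER_RUPEE // 16   # 12 pies to the anna
PIES_PER_PICE = PIES_PER_RUPEE // 64   # 3 pies to the pice

# Assumption (not in the review): the decimal rupee has 100 naye paise.
NAYE_PAISE_PER_RUPEE = 100


def to_rupees(rupees=0, annas=0, pice=0, pies=0):
    """Return an old-style amount as an exact fraction of a rupee."""
    total_pies = (rupees * PIES_PER_RUPEE + annas * PIES_PER_ANNA
                  + pice * PIES_PER_PICE + pies)
    return Fraction(total_pies, PIES_PER_RUPEE)


def to_naye_paise(amount_in_rupees):
    """Convert a rupee amount to the assumed decimal unit, without rounding."""
    return amount_in_rupees * NAYE_PAISE_PER_RUPEE


one_anna = to_rupees(annas=1)
print(one_anna)                  # fraction of a rupee
print(to_naye_paise(one_anna))   # not a whole number of the new unit
```

Exact fractions are used so that amounts such as one anna, which is not a whole number of hundredths of a rupee, are not rounded silently.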


The New Inflation. Willard L. Thorp and Richard Quandt. New York: McGraw-Hill Book 
Company, 1959. Pp. xi, 233. $5.00. 


Thomas Mayer, Michigan State University 


This book is an outgrowth of a conference of twenty-three economists who met at 
the Merrill Center for Economics in 1958. It is, however, not a summary of the 
conclusions reached by the conferees, but represents the reactions of its two authors. 
It is addressed to the general reader and does not assume any knowledge of economics. 
As such it is a brilliant job of exposition. The authors present, in easy language, ideas 
which we usually teach only in advanced courses. For example, the reader is told in 
simple terms about the index number bias, the Pigou effect, the Graham plan, the 
“bills only” doctrine and Nurkse’s emulation effect in underdeveloped countries. 
Thus while eschewing diagrams and rigorous formulation, the book does not leave 
out any important ideas because they are too difficult to explain to the layman—it is 
the closest to “economics without tears” which I have ever seen. And even the economist 
who has kept up his reading is likely to find a few new and interesting nuances. 

The book begins with a look at the historical record, and then takes up demand- 
pull inflation in both its quantity theory and income-expenditure versions. Unfor- 
tunately the quantity theory used looks more like the standard version of Money and 
Banking textbooks than the modern Chicago version. The next chapter explains the 
cost-push theory of inflation. Having described the two possible causes of inflation, 
the authors discuss possible checks on inflation and the interaction of economic 
growth and inflation. This is followed by an excellent chapter on the international 
aspects of inflation and the impact of inflation on underdeveloped countries. 

Thorp and Quandt then deal with policies to prevent inflation and are essentially 
quite pessimistic. Monetary and fiscal policy are “fairly” effective when used against 
demand-pull inflation, and should be used against such an inflation except in cases 
where they would cause serious unemployment or would interfere with economic 
growth. Against cost-push inflation, however, monetary and fiscal policy are probably 
not sufficient, because they could stop such an inflation only at the cost of substantial 
unemployment. Price control is also rejected, but the authors have some faith in 
policies to increase factor mobility, raise productivity, and set up countervailing 
power. (But to this reviewer this sounds a bit like saying that although we cannot 
match the Russians in missiles we should at least produce more bows and arrows 
than they do.) After this pessimistic appraisal the authors conclude that creeping 
inflation, both of the cost-push and of the demand-pull variety, is inevitable, but that 
fortunately creeping inflation is by no means a catastrophic evil, though it does have 
some bad effects. 

Thus, the book provides a good summary of professional opinion on the inflation 
topic, and it is not the fault of Thorp and Quandt that the profession cannot feel 
proud of its insight into this problem. While the reader is not given a well explicated 
theory, he does obtain much more information than I had thought possible to present 
in simple terms within 228 pages. 

There are, however, a few minor slips. The authors seem unclear about the dis- 
tinction between the Pigou and Keynes effects (and also relate the Pigou effect to 
cash rather than to the net indebtedness of the government), and while explaining the 
specie flow mechanism, they do not mention the income version of the gold standard 
mechanism. Moreover, the reader is led to believe that banks discount commercial 
paper with the Federal Reserve rather than borrow on their own notes. But these are 
minor blemishes on an excellent job. 


Management Technology. Roger R. Crane and C. West Churchman, Editors. Pleasantville, 
New York: Monograph No. 1, The Institute of Management Sciences, 1960. Pp. 91. 
$2.00. 


John F. Muth, Carnegie Institute of Technology 


Most professional societies have, at some time or other, felt the need to reach a 
larger audience than the regular readers of technical journals. Many ways of 
meeting this objective have been tried, but few have been successful. As the first of 
a projected series of monographs on management science intended for administrators, 
Management Technology will not, I think, spoil the record. 

According to the editors’ foreword, the objective of the monograph series is to 
publish three kinds of articles of interest to operating management: “(1) case histories 
of the applications of management science to management problems, (2) papers on the 
problems of applying new developments in management sciences to business systems, 
and (3) management discussions of management sciences, such as the nature of bus- 
iness problems which they feel are important and can be attacked by management 
sciences.” Solid case material is difficult to obtain, especially with data not badly 
mangled by the disguising process and with a clear description of what really hap- 
pened in the firm after an alleged improvement had been made. 

It is hardly surprising that of the eleven papers in the monograph, only four could 
possibly be construed as case histories of applications and none deals in any detectable 
way with problems of application. Most of the papers are expository discussions of 
techniques which may be applicable to business problems. 

In addition to two short presentations from a panel discussion entitled “Operating 
Management Speaks,” the articles are concerned with linear programming, systems 
engineering, simulation of waiting-lines, and technological measurement. Of the four 
articles on linear programming, that by K. Eisemann and W. M. Young should be 
most interesting to management because it emphasizes using linear programming to 
show where profitability can be increased by modifying restrictions imposed in the 
original problem. 

Three additional articles are related to systems engineering. The paper by M. M. 
Flood is a rhapsodical introduction to the field, while that by D. S. Stoller and R. Van 
Horn goes further in describing three design approaches for information systems. J. P. 
Hyland suggests that a major advantage of quality control is its tendency to improve 
the dynamic stability of the process. This will sound silly to most statisticians, but 
maybe he is right. 

The article on “Monte Carlo Solution of Waiting Line Problems,” by B. E. Goetz, 
is an admirably clear—if somewhat sanguine—description of simulation. The as- 
sumptions and data requirements are made explicit in terms meaningful to manage- 
ment. In his only reference to statistics, however, he asserts: “The adequacy of the 
length of the simulation period can be checked statistically.” Because successive ob- 
servations are not independent in Monte Carlo simulations, the significance level of 
any differences will be overstated by an unknown amount, unless special techniques 
are devised. 
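
The point about dependence can be illustrated numerically. The sketch below is neither Goetz's procedure nor one proposed in this review: it generates a positively autocorrelated series (a first-order autoregression, chosen here only as a stand-in for successive simulation outputs) and compares the usual standard error of the mean, which assumes independence, with one based on batch means, a later and commonly used device. All names and parameter values are assumptions made for illustration.

```python
import random
import statistics


def ar1_series(n, phi, seed):
    """Generate n values of x[t] = phi * x[t-1] + e[t], with e[t] ~ N(0, 1)."""
    rng = random.Random(seed)
    x = 0.0
    out = []
    for _ in range(n):
        x = phi * x + rng.gauss(0.0, 1.0)
        out.append(x)
    return out


def naive_se(xs):
    """Standard error of the mean computed as if observations were independent."""
    return statistics.stdev(xs) / len(xs) ** 0.5


def batch_means_se(xs, n_batches):
    """Standard error of the mean estimated from the means of consecutive batches.

    Long batches are nearly independent of one another even when
    neighbouring observations are not.
    """
    size = len(xs) // n_batches
    means = [statistics.fmean(xs[i * size:(i + 1) * size])
             for i in range(n_batches)]
    return statistics.stdev(means) / n_batches ** 0.5


xs = ar1_series(20000, phi=0.9, seed=1)
print("naive standard error:      ", naive_se(xs))
print("batch-means standard error:", batch_means_se(xs, 20))
```

With strong positive autocorrelation the naive figure is much the smaller of the two, so a test built on it would claim more precision than the run actually delivers.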

The article by S. B. Littauer, “On Some Aspects of Technological Measurement,” 
tortuously makes the point that the variability among samples of industrial data 
(e.g. job times) tends to be much larger than one would expect on the basis of 
within-sample variance. 

The other papers ignore uncertainty in parameter estimates, possibly because it is 
regarded as of secondary importance. This sampling of management science suggests 
that more effective use could be made of existing statistical techniques, but it also 
points out a need for methods more relevant to business problems. 


An Introduction to Modern Mathematics. Robert W. Sloan. Englewood Cliffs, N. J.: 
Prentice-Hall, Inc., 1960. Pp. xi, 73. $3.75. 


Wilson M. Zaring, University of Illinois 


From this book, a reader only slightly familiar with high school mathematics can, 
in a very brief time, gain an understanding of such terms as set, union, intersec- 
tion, solution set, open sentence, variable, conjunction, disjunction, quantifier, car- 
tesian product, relation, function, and others. These ideas are set forth with great 
clarity. They are frequently motivated with very graphic illustrations. The interested 
reader who wishes to secure a more “modern” vocabulary should find this book of 
great interest. It is both readable and concise. 

However, the reader who wishes to learn more than just the vocabulary should 
look elsewhere. There are no theorems with proofs. While the author discusses briefly 
axioms and the use of logic in deducing conclusions from axioms, no examples are 
given. In this regard, the book is most disappointing. Indeed, it is my feeling that this 
is not consistent with the announced objectives of the author. 

He states in the preface, “This book has as its major purpose the development of 
elementary mathematics from a modern point of view.” Since it does not contain a 
single theorem with proof, it is my opinion that it is not a “development of.” In this 
respect, an excellent opportunity was missed in Chapter 3. The events up to that 
point were the following. 

Chapter 1 (9 pages) begins with a very brief discussion of “modern mathematics” 
and a statement of the purposes of the book. This is followed by an excellent discus- 
sion of the distinction between names and the thing named. The chapter ends with a 
discussion of standard sets for counting and the Arabic numerals. 

Chapter 2 (17 pages) is concerned with such concepts as set, subset, intersection, 
union, complement, universal set, and the empty set. This is followed by a discussion 
of variables, sentences, conjunction, disjunction, negation, the conditional, the bi- 
conditional, tautologies, quantifiers, axioms and axiom systems. 

Chapter 3 (13 pages) begins with a statement of the field postulates. The author 
then remarks, “deriving theorems abstractly from a list of axioms is sometimes quite 
difficult; so, we will start, not from the axioms, but from what we hope is a model of 
the axioms.” He then discusses the number line in a traditional heuristic way. This I 
found to be an anticlimax, whose depth is reached when the author appeals to Hankel’s 
“Principle of Permanence of Formal Laws” to justify (−2)(−1) = (+2). Since the 
author chooses to discuss the laws of signs, the announced objectives of the book and 
the development to this point cry out for a theorem. 
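
As an illustration of the kind of theorem meant here, the following is a sketch of the standard derivation of the law of signs from the field postulates. It is supplied for illustration only and is not taken from the book under review; the numbering of the lemmas is arbitrary.

```latex
\textbf{Lemma 0 (uniqueness of additive inverses).}
If $x + y = 0$ and $x + z = 0$, then
$y = y + 0 = y + (x + z) = (y + x) + z = 0 + z = z$.

\textbf{Lemma 1.} $a \cdot 0 = 0$.
Indeed, $a \cdot 0 = a \cdot (0 + 0) = a \cdot 0 + a \cdot 0$;
adding $-(a \cdot 0)$ to both sides gives $0 = a \cdot 0$.

\textbf{Lemma 2.} $(-a)\,b = -(ab)$.
Indeed, $ab + (-a)\,b = \bigl(a + (-a)\bigr)\,b = 0 \cdot b = 0$
by distributivity, commutativity and Lemma 1, so by Lemma 0
$(-a)\,b$ is the additive inverse of $ab$.

\textbf{Theorem.} $(-a)(-b) = ab$.
Applying Lemma 2 twice (with commutativity of multiplication),
\[
  (-a)(-b) = -\bigl(a\,(-b)\bigr) = -\bigl(-(ab)\bigr) = ab ,
\]
where the last step holds because $ab + \bigl(-(ab)\bigr) = 0$ shows,
by Lemma 0, that $ab$ is the additive inverse of $-(ab)$.
In particular, with $a = 2$ and $b = 1$: $(-2)(-1) = 2 \cdot 1 = +2$.
```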

The remainder of the chapter deals with the arithmetic of directed numbers, ordering, 
absolute value, equations, inequalities, solution sets and their graphs. 

In Chapter 4 (13 pages) ordered pairs, cartesian products, lines and linear equa- 
tions, relations and functions are defined and discussed briefly. One page is devoted 
to illustrating the use of set notation in solving a system of two linear equations in two 
unknowns. This is the only example in the book that gives the reader an idea as to 
how the concepts previously defined can be put to use. 


The Public and the Programmes: A Report on an Audience Research Enquiry. British 
Broadcasting Company, 1959. Pp. 71. Eight shillings and sixpence. Paper. 


A. L. Finkner, Research Triangle Institute 


This manuscript reports the results of a special survey of the television and radio 
audience in Great Britain conducted in the latter part of 1958 by the British 
Broadcasting Company’s Audience Research Department. This special enquiry was 
designed to supplement their continuous “Survey of Listening and Viewing.” 

The report is divided into five chapters: The Public, Listening in the Evening, View- 
ing in the Evening, Listening in the Daytime, and The Tastes of Listeners and View- 
ers. It also contains a summary entitled “Highlights and Reflections” and an appendix 
describing the sampling plan and field work results. The report is principally the pre- 
sentation of some 58 tables describing listening and viewing habits with an attempt to 
relate these habits to various characteristics of the audience. 

The authors indicate that the sampling plan was multistage in nature with the 
frame being a list of names and addresses of electors. The non-response rate for the 
survey was approximately 30% even though the personal interview method of contact 
was used. The authors state, 
“The extent to which failure to interview some of the persons listed impairs the repre- 
sentativeness of those who were interviewed depends upon the nature of the relation- 
ship, if any, between the cause of failure and the question being asked. In some cases 
there is no prima facie reason why such relationship should exist, and hence no reason 
to suppose that the inclusion of these persons would have affected the results. Even 
in those cases where replies might have been differently distributed, it seems unlikely 
that their inclusion would so have altered the figures given as to have led to a major 
modification of any of the conclusions in the report.” 


To the extent that their judgment is correct, the results are valid. But, as is well 
known, it is hazardous to make these assumptions without some evidence from the 
non-response group. A non-response rate of 30% is rather high for most personal inter- 
view type surveys. 
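
How much a 30% non-response could matter, absent any evidence about the non-respondents, can be gauged by a worst-case calculation. The sketch below is not a method used in the report or proposed by this reviewer; it simply bounds a population proportion by supposing that the non-respondents all answer one way or all answer the other. The 70% response rate follows from the figure quoted above, while the respondent proportion of 0.40 and the function name are hypothetical.

```python
def worst_case_bounds(p_respondents, response_rate):
    """Bound a population proportion when non-respondents are unobserved.

    p_respondents: proportion with the attribute among those interviewed.
    response_rate: fraction of the selected sample actually interviewed.
    Returns (low, high): low assumes no non-respondent has the attribute,
    high assumes every non-respondent has it.
    """
    if not 0.0 <= p_respondents <= 1.0:
        raise ValueError("p_respondents must lie in [0, 1]")
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must lie in (0, 1]")
    low = p_respondents * response_rate
    high = low + (1.0 - response_rate)
    return low, high


# Hypothetical: 40% of respondents express a given preference, 70% responded.
low, high = worst_case_bounds(0.40, 0.70)
print(f"population proportion lies between {low:.2f} and {high:.2f}")
```

The width of the interval always equals the non-response rate, which is why evidence from the non-response group, rather than assumption, is needed to narrow it.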

One of the major conclusions drawn from the study is that the less a person listens 
or views, the more likely he is to be selective in his preferences. In most cases, the evi- 
dence presented seems to confirm expected behavior. It is interesting to note that the 
“tastes” of listening audiences, that is, the distribution of their expressed preferences 
for various types of programs, changed little between 1943 and 1958. 

Although Mr. R. J. E. Silvey, Head of Audience Research, BBC, stated that the 
report is now published in the belief that it contains information of general interest, 
this reviewer doubts that it will have general appeal. There may be items of interest 
to the general public but these are lost in the mass of detail presented. There is nothing 
unique, methodologically, to attract a sampling statistician. Some social scientists 
might be interested in the types of questions asked. The generalizations drawn from 
the factual data cannot be extended to other countries, and, of course, the authors 
do not imply that they should be. It appears that the greatest value of the report will be to 
the personnel in BBC who are charged with the responsibility of producing programs 
which cater to the public sampled. This, of course, was the original and primary pur- 
pose of the survey. 


Analytic Trigonometry. Paul S. Mostert. Englewood Cliffs, N. J.: Prentice-Hall, Inc., 
1960. Pp. x, 166. $3.95. 


From the Preface, “This book provides an analytic treatment of Trigonometry, 
with the function concept as basic to the development of the subject, and with the 
trigonometric functions as functions of real numbers being the principal objects of 
study... .” 

“The treatment is somewhat shorter than standard texts in trigonometry. For that 
reason, and because of the analytic approach, the student will probably find it a 
little more challenging. However, it is still well within the capabilities of the average 
student, and should prove a more valuable investment of his time. 

“The treatment is also a considerable departure from the standard presentation. 
The reason for this is that the traditional treatment is simply not what the student of 
either mathematics or the sciences needs. This text represents an attempt to give the 
reader what he needs in an amount of time commensurate with the value of the sub- 
ject which could be as little as one-third the time traditionally allotted to trigonom- 
etry. At the same time it will give him a better preparation for trigonometry’s most 
important uses.” 

There are a good many exercises, the answers to many of which are given. 

This reader would be a little happier with the book if the exercises and chapters 
contained more intuitive and motivating material. 

W.G.M. 


Quality Control and Applied Statistics Yearbook 1959. Robert S. Titchen, Arnold J. 
Rosenthal, Bruce Bollerman, and Frank Nistico, Editors. New York: Interscience Pub- 
lishers, 1960. Pp. 1348. $60. 


“These abstracts were originally published as “Quality Control and Applied 
Statistics Abstract Service,” Vol. IV, Issues 1 through 12, 1959.” 

The Preface states, “This volume represents a collection of abstracts of papers 
dealing with quality control and applied statistics. The material was originally pub- 
lished in 1959 in the outstanding journals in this and foreign countries, and therefore 
constitutes documentary evidence of progress in these fields throughout the world. 

“It should be noted that although the information is presented in the forms of 
abstracts, these abstracts differ from those conventionally used. They are far more 
complete and generally contain original data, graphs, charts, photographs, etc., so 
that in many cases workers will not find it necessary to refer to the original work. The 
abstracts were published originally in looseleaf form. The bound and indexed volume 
will make them still more accessible in libraries for reference purposes.” 

These abstracts are exceedingly helpful. The only criticism of the present volume 
is that it does not contain the tables of contents that appear in the monthly collections 
of abstracts in the serial service. There are many more abstracts than would 
appear, for example, in Mathematical Reviews, and it remains to be seen whether the 
new International Statistical Institute Abstract Journal will be more helpful. The 
scope of the abstracts includes all applied statistics and is not limited to the parts of 
applied statistics most directly related to quality control. Also, papers on operations 
research are abstracted in this series. 

Thus this reviewer tends to believe that a more correct description of the contents 
would be obtained by changing the title to Applied Statistics, Operations Research, 
and Quality Control rather than giving to quality control the major position in the 
title. It would also seem desirable to present a reviewer index as well as a more detailed 
subject index in addition to the author and subject indexes that now appear. 

To illustrate the usual contents of the series, let us cite the major headings for a 
single month: Statistical Process Control, Sampling Principles and Plans, Manage- 
ment of Quality Control and Reliability, Mathematical Statistics and Probability 
Theory, Experimentation and Correlation, Managerial Applications, Measurement 
and Control, Reliability of Complex Assemblies. 

W.G.M. 


Some Aspects of Analysis and Probability. (Volume IV of Surveys in Applied Mathe- 
matics.) Irving Kaplansky, Edwin Hewitt, Marshall Hall, Jr., and Robert Fortet. New 
York: John Wiley and Sons, Inc., 1958. Pp. xi, 243. $9.00. 


Donald A. Darling, The University of Michigan 


This book, as with the others in the same series, attempts to summarize the progress 
and present status of the subject matter treated. The exposition is as simple 
and elementary as is consistent with the concise and inclusive expositions the authors 
attempt. 

Even so, there are perhaps very few readers, not including the reviewer, who can 
intelligibly read the entire volume, and the publisher might have been well advised 
to publish three or four separate memoirs with a more leisurely pace. Indeed several 
authors attributed their brevity to lack of space. 

The four surveys included in this volume are: “Functional analysis” by Irving 
Kaplansky, “A survey of combinatorial analysis” by Marshall Hall, Jr., “A survey 
of abstract harmonic analysis” by Edwin Hewitt, “Recent advances in probability 
theory” by Robert Fortet. 

These four eminent authors have done an admirable job of compressing, distilling 
and winnowing, and a reader reasonably conversant with the fields can get a quick 
and fairly accurate picture of the global status of the material treated. Though the 
exposition is, in each case, admittedly eclectic, there are extensive bibliographies to 
aid further reading. Though there can be, of course, no universal agreement on what 
constitutes “the outstanding problems” in the field, the authors give their valued 
opinions. 


Statistical Independence in Probability, Analysis and Number Theory. (Carus Mathemati- 
cal Monographs, No. 12.) Mark Kac. New York: John Wiley and Sons, Inc., 1959. Pp. 
93. $3.00. 


Charles H. Kraft, Michigan State University 


In this monograph Professor Kac displays the role of statistical independence as it 
arises in the problems of the strong law of large numbers, the central limit theorem, 
and in the kinetic implication of ergodic theory. He further studies the role of this 
same notion in the problem of the distribution of prime numbers. 

The first three problems above have well known phenomenological origins whereas 
the fourth is strictly a problem of “pure” mathematics. This contrast is the theme of 
the book and Kac develops it well in the course of discussing these problems. 

The monograph should be interesting reading either for the mathematician who 
wants further insight into the relationship between arithmetical properties of mathe- 
matical models and the observable properties of the realities abstracted by the model, 
or for the empiricist who wants a better understanding of the sort of conclusion and 
suggested experimentation that mathematical theory can provide for his objects of 
observation. 


Statistical Manual. Edwin Crow, Frances A. Davis, and Margaret W. Maxfield. (Navord 
Report 3369, U. S. Ordnance Test Station.) China Lake, California: 1955 (Available 
from Office of Technical Service, U. S. Department of Commerce, Washington 25, 
D.C.). Pp. xvii, 288. $6.00. 


W. D. Baten, Michigan State University 


This manual contains material found in some of the newer statistical texts, such as 
ideas about measures of central tendency, measures of dispersion, various types of 
frequency distributions together with tests concerning averages, standard deviations, 
percentages, goodness of fit, normality and regression coefficients under various 
conditions. There is material on fiducial limits, the power of a test, determination of a 
sample size, the sign test, relations of means and variances, Chi-Square, runs, trans- 
formations for obtaining normality, and sensitivity. There is a chapter on planning 
experiments and the analysis of variance that includes fundamentals, one-classifica- 
tion, two-classifications, three-classification designs, tests for homogeneity of vari- 
ances with examples. 

This well written book contains ideas on linear, multiple, non-linear regression, 
correlations, reliability and confidence intervals for predicted values. 

The part on quality control includes control charts for averages, per cent defec- 
tives, number of defectives with their upper and lower limits and also acceptance 
sampling based on either attributes or variables, operating characteristic curves and 
sequential sampling. 

There are 21 valuable tables, several of which do not appear in most textbooks on 
statistics such as: critical values for the M Distribution for testing the null hypo- 
thesis that k populations have the same variance, critical values for runs, critical 
values for tests using the range, for testing whether a trend of means is real, orthogonal 
polynomials, a table for indicating when extreme measurements may be discarded, 
constants for a sampling plan with OC curves through 2 points, tables of 
log[(1 − β)/α] and log[(1 − α)/β], confidence limits for a proportion. 

The manual contains 11 charts for probabilities of shots falling in certain 2 dimen- 
sional places, confidence belts for proportions, OC curves, number of degrees of 
freedom required to estimate the standard deviation within p per cent of its true 
value, number of degrees of freedom in each sample required to estimate σ₁/σ₂ 
within p per cent of the true value and confidence belts for the correlation coefficient. 

This book contains a great deal of information and should prove helpful to many. 
It has many illustrations that show how to apply statistical theory to practical 
problems. 


Direction of International Trade: Annual Issue. A Joint Publication of United Nations 
(Statistical Office), International Monetary Fund, and International Bank for Recon- 
struction and Development. Statistical Papers, Series T, Volume X, No. 8. New York: 
Columbia University Press, 1959. $2.50. 


This issue provides annual trade-by-country data for the years 1938, 1948, and 
1955-1958, inclusive. With the exception of Bolivia, exports are valued f.o.b., 
whereas imports are valued in some instances c.i.f. and in other instances f.o.b. 
Tables are presented by countries, each table showing the value in U. S. dollars of 
the country’s exports to and imports from all other countries during the years 
covered. A summary table at the beginning reconciles the total distributed by coun- 
tries with the totals adjusted for general international comparison. Another set of 
summary tables presents data on foreign trade for the principal monetary areas of the 
world. 
R. F. 


Yearbook of Labour Statistics 1959. International Labour Organization. Geneva, 1960. 
$5.00. 


Based on official data supplied by the various countries, this volume presents sta- 
tistics on a wide variety of subjects of interest to labour, including data on popu- 
lation, employment, unemployment, hours of work, wages, wholesale and consumer 
price indices, family income and expenditures, Social Security, industrial injuries, 
industrial disputes, industrial production, and exchange rates. Data are presented by 
major divisions of economic activity and by specified manufacturing industries. 
Where possible, data are supplied annually for 1952-1959 as well as for the prewar 
years 1937-1938. For some indicators, such as prices, wages, and employment and 
unemployment, data are supplied monthly for recent years. 

Each of the major sections is preceded by useful explanations of the nature of the 
data. The source of each series is provided in a final section. 

R. F. 


International Tax Agreements, Volume VIII. United Nations Department of Economic 
and Social Affairs. New York: Columbia University Press, 1960. $7.50. 


The eighth in a series of volumes on international tax agreements, this publication 
presents summary data on the status as of June 1, 1955 of all international tax 
agreements for the avoidance of double taxation and the prevention of tax evasion. 
The tax agreements are listed separately for each country. In addition, two summary 
tables are presented, one giving a chronological list of all tax agreements by principal 
subject matter, and the other a cross-classification table showing the types of agree- 
ments existing between different nations. 
This publication supersedes Volumes III and IV of this series issued in 1951 and 
1954. 


The 1960 Proceedings of the Business and Economic Statistics Section and the Social 
Statistics Section of the American Statistical Association are now available. These 
contain the papers and discussions given under the sponsorship of these Sections at 
the 120th Annual Meeting of ASA at Stanford University, August 23-26, 1960. Price to 
members for the Business and Economic Statistics Section Proceedings, $3.50; price to 
non-members, $5.00. Price to members for the Social Statistics Section Proceedings, 
$2.75; to non-members, $4.00. ($.25 will be added to invoice if remittance does not 
accompany order.) To receive the Proceedings at the member price, the member’s name 
must be included on the order. Order your copies from the American Statistical 
Association, 1757 K Street, N.W., Washington 6, D.C. 


Exclusive 
Monro-Matic 


AUTOMATIC 
SQUARING 


works it far 
faster for you! 


Enter an x value into the Monro-Matic 
Statistical Calculator keyboard only once; 
the machine takes over the squaring; 
automatically, your task is nearly halved. 


This is one of many advanced features 
especially pleasing to people who live with 
figures. Why not ask your Man from 
Monroe for a free demonstration of the 
Monro-Matic Statistical Calculator today? 


Monroe Calculating Machine Company, Inc. 
Sales and service in principal cities everywhere. General offices, Orange, N. J. 
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MONROE for CALCULATING
ADDING ACCOUNTING
DATA PROCESSING MACHINES
A Division of Litton Industries


COMPUTING 


SERVICE 


... Made to Order
For Researchers 
and Statisticians 


Since few companies have 
enough work volume to jus- 
tify computers of their own, 
STATISTICAL maintains com- 
puting equipment to serve any 
company on a low-cost, hourly, 
as-needed basis. 

This service is built around 
the combined skills of mathe- 
maticians, statisticians, project 
engineers and programming 
specialists—ready to work for 
you on your computing prob- 
lem. 


Here are a few of the appli- 
cations in which our computer 


STATISTICAL 


TABULATING CORPORATION 


Established 1933 


will excel in saving you time 
and money: 


• Simple and Multiple Correlations and Regressions
• Analysis of Variance
• Factor Analysis
• Chi Square for a Contingency Table
• Matrix Calculations
• Linear Programming
• Curve and Surface Fitting
• Solution of Differential Equations


Just contact our nearest office 
today for a free analysis and 
cost estimate of your problem.


GENERAL OFFICES: 


53 West Jackson Boulevard 
Chicago 4, Illinois 
Phone: HArrison 7-4500 


Chicago • New York • St. Louis
Newark • Cleveland
Los Angeles • Kansas City
San Francisco • Milwaukee
Philadelphia • Palo Alto
Van Nuys • San Jose


TABULATING - COMPUTING - CALCULATING 


TYPING - TEMPORARY OFFICE PERSONNEL 


Computer Centers in 
New York, Chicago 
and Los Angeles 
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TEXACO INC.


Offers challenging career opportunities for 
STATISTICIANS 


Expanding research has created openings for statisticians to apply mod- 
ern statistical approaches to problems in research, development and con- 
trol. 


Wide distribution of computer facilities available, with opportunities for 
advancement in scientific and supervisory positions. 


Qualifications: MS or PhD degree, with or without professional ex- 
perience. 


Locations: Beacon, New York and Houston, Texas


Address inquiries to: 


Dr. Robert E. Conary 
Texaco Research Center, Beacon, New York 


Mathematical Statisticians 


Exceptional opportunities exist at the Naval 
Weapons Laboratory for mathematical statis- 
ticians with Bachelor’s, Master’s and Doctor’s 
degrees and an interest in operations re- 
search. The principal efforts of the Opera- 
tions Research Group at present are devoted 
to the formulation and execution of extensive 
programs in the areas of target analysis, 
weapons system analysis, and missile feasi- 
bility and evaluation. Senior Statisticians on 
the staff also serve as consultants in areas of 


statistical inference, probability, and experi- 
mental design. The most advanced computing 
equipment and capable junior scientists are 
available for assistance. Starting salaries 
range from $5,880 to $11,595 per annum. The 
Naval Weapons Laboratory provides an ex- 
cellent work atmosphere and, in addition, the 
advantages of living in a pleasant small com- 
munity with economical housing and many 
recreational facilities. 


For further information, write to the Director, 
Computation and Analysis Laboratory. 


NWL 


U. S. Naval Weapons Laboratory 
Department of the Navy • Dahlgren, Virginia
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Careers in 


Mathematics 


Vitro’s increased activities in the field of Operations 
Research have created career opportunities for men 
with these interests and qualifications: 


MATHEMATICAL STATISTICIANS 


MS or PhD for conducting and consulting on analytical 
studies in a wide variety of applications, including informa- 
tion theory, weapons systems analysis, experimental design, 
data treatment. Should be familiar with some of the follow- 
ing—Monte Carlo procedures, Markov processes, decision 
theory, operations research, and have had 3-5 years industrial 
experience in implementing these techniques. Position is in 
the Information Analysis Group. 


OPERATIONS RESEARCH ANALYSTS 


MA or PhD in Mathematics, Statistics or Physics. Conduct 
and direct operations research studies, principally in the 
areas of weapons systems evaluation, ballistic missile de- 
fense, anti-submarine warfare and electronic countermeas- 
ures. Should have experience in some of the following areas: 
applications of game theory, linear programming, Monte 
Carlo techniques, queueing theory and model construction. 


> Our modern laboratory is located in a suburban area with 
easy access to the cultural and educational facilities of metro- 
politan New York and New Jersey. Liberal benefits include 
a tuition refund plan and relocation allowances.


Please send resume to 
Mr. S. Roberts. 


Division of Vitro Corporation of America 
200 Pleasant Valley Way West Orange, New Jersey 
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Practical 

Business 

Statistics 
3rd Edition 


Practical 

Problems in 

Business Statistics 
2nd Edition 


Modern 

Elementary 

Statistics 
2nd Edition 


PRENTICE-HALL INTERNATIONAL SERIES IN MANAGEMENT


Executive Decisions and Operations Research 


by DAVID W. MILLER and MARTIN K. STARR, both of Columbia University
and Consultants in Operations Research and Management Science


by FREDERICK E. CROXTON, Columbia Uni- 
versity, and DUDLEY J. COWDEN, University 
of North Carolina 


The new Third Edition offers an early introduction 
to and a constant emphasis on statistical inference. 
It stresses concepts rather than computation and pro- 
vides short, simple chapters on mathematics. 


1960 701 pp. Trade price: $10.35* 


by DUDLEY J. COWDEN and MERCEDES S. 
COWDEN, both of University of North Carolina 


This text offers new problems with all the technical 
and graphical materials presented and gives particu- 
lar attention to questions which pertain to theory. 
Although organization, symbols and formulae con- 
form to PRACTICAL BUSINESS STATISTICS, 
3rd Ed., the new problems book is suitable for use 
with any standard text on business statistics. 


1960 112 pp. Price: $4.25* 


by JOHN E. FREUND, Arizona State University 


Containing the same scope as the earlier edition, 
this new revision provides an increased emphasis on 
statistical inference. Certain symbolism and defini- 
tion of standard deviation have been updated. 


1960 413 pp. Trade price: $9.25* 


The authors provide a basic understanding of decision theory, operations 
research and other recent developments of importance for the rational 
analysis of executive decision problems for business executives and others. 


1960 446 pp. Trade price: $10.00* 


Prediction and Optimal Decision: 


Philosophical Issues of a Science of Values 
by C. WEST CHURCHMAN, University of California at Berkeley


Here is the first systematic examination of the scientific basis of decision 
theory which represents a treatment of the relationship between science 
and decision making.
January 1961 


Approx. 416 pp. Trade price: $9.00* 


*Also available in a textbook edition for quantity sales to colleges 


For approval copies, write: Box 903, Dept. ASA 
PRENTICE-HALL, Inc. 
Englewood Cliffs, New Jersey 
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INTERNATIONAL BUSINESS 
MACHINES CORPORATION 


COMPUTER PROGRAMMING at IBM is being extended to
include many new areas—such as orbit computation, meteoro- 
logical satellites, space probes, information retrieval, design auto- 
mation, real-time systems, and optical studies. As a result, we are 
greatly expanding our programming staff, creating opportunities 
for people with various levels of experience. Assignments involve 
a wide variety of problems in science, business and government. 


Qualifications: Degree in Math, Statistics, the Physical Sciences, 
Engineering or Engineering Science . . . plus one year’s program- 
ming experience. 


MATHEMATICS RESEARCH at IBM involves interesting challenges
in a wide variety of areas. These include matrix algebra;
logic; mathematical physics; and probability, communication and 
information theory. Other fields which are also being subjected to 
intensive study are numerical analysis, combinatorial topology, 
and operations research. 


Qualifications: B.S., M.S., or Ph.D. in Math, Physics, Statistics, 
Engineering Science, or Electrical Engineering — and proven ability 
to assume important technical responsibilities in your sphere of 
interest. 


APPLIED MATHEMATICS offers unusually fine opportunities 
for the math-oriented. You will be asked to apply your knowledge 
of mathematical and statistical analysis, probability, logic, and 
coding to advanced computer development problems. Assignments 
may take you into computer systems design, component engineer- 
ing, human factors engineering, or feed-back control theory, infor- 
mation and communication theory, inertial guidance, and scientific 
programming. 

Qualifications: B.S. or Advanced Degree in Math, Physics, or Sta- 
tistics — plus related experience. 


There is a wide and diverse range of career opportunities at IBM. 
Advancement is rapid. The demands of a constantly expanding 
program of research and development, and promotion from 
within, based on individual merit and achievement, make this 
possible. Working alone or on a small team, you'll find that spe- 
cialized assistance is readily available. 


For details, write, outlining your background and interests, to: 
Mr. D. R. Morrisey, Dept. 577Z 

IBM Corporation 

Product Development Laboratory, Endicott, N.Y. 
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IBM. 


TABLES OF 
THE HYPERGEOMETRIC 
DISTRIBUTION 


Gerald J. Lieberman and Donald B. Owen. For the first time, the hy- 
pergeometric distribution is fully tabulated and the results of exact 
machine computation are available. Of great use in mathematics, sta- 
tistics, the social and physical sciences, and engineering. Stanford 
Studies in Mathematics and Statistics, III. 

February. About $15.00 


STUDIES IN ITEM ANALYSIS 
AND PREDICTION 


Edited by Herbert Solomon. This integrated series of mathematical 
studies presents many new theoretical developments in both test de- 
sign and the classification of individuals on the basis of responses to 
tests. Stanford Mathematical Studies in the Social Sciences, VI. 
March. About $8.75 


MARKOV LEARNING 
MODELS FOR MULTIPERSON 
INTERACTIONS 


Patrick Suppes and Richard C. Atkinson. A major attempt to bridge 
the gap between statistical learning theory and social psychology by 
applying stimulus-sampling techniques to social interaction situations. 
Stanford Mathematical Studies in the Social Sciences, V. 

Ready. $8.25 


Order from your bookstore, please
Stanford University Press
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Stanford Studies in Mathematics and Statistics 
TABLES OF THE NON-CENTRAL 
t-DISTRIBUTION 


Density Function, Cumulative Distribution 
Function, and Percentage Points 
George J. Resnikoff and Gerald J. Lieberman. 


CONTRIBUTIONS TO PROBABILITY 
AND STATISTICS 


Essays in Honor of Harold Hotelling 
Olkin, Ghurye, Hoeffding, Madow, and Mann, Editors. $6.50 


Stanford Mathematical Studies in the Social Sciences 


STUDIES IN THE MATHEMATICAL THEORY 
OF INVENTORY AND PRODUCTION 
Kenneth J. Arrow, Samuel Karlin, and Herbert Scarf. $8.75


STUDIES IN LINEAR 
AND NON-LINEAR PROGRAMMING 


Kenneth J. Arrow, Leonid Hurwicz, and Hirofumi Uzawa. $7.50 


STUDIES IN MATHEMATICAL 
LEARNING THEORY 
William K. Estes and Robert R. Bush. 


MATHEMATICAL METHODS 
IN THE SOCIAL SCIENCES, 1959 
Kenneth J. Arrow, Samuel Karlin, and Patrick Suppes, Editors. $8.50 


Order from your bookstore, please 
Stanford University Press 
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PRINCETON UNIVERSITY PRESS 


The Quality and Economic Significance of Anticipations Data 
National Bureau of Economic Research, Special Conference 
Series, No. 10 


Government, industry, and households must constantly make decisions about 
buying, selling, or investing that are based on the expected future behavior of 
our economy and that collectively guide the course of economic development. 
This volume weighs the predictive value of anticipations data both as an aid 
to forecasting and as a clue to how the total economic process generates
expectations. Published for the National Bureau of Economic Research.

504 pages. Tables & Charts. $9.00 


Soviet Statistics of Physical Output of Industrial Commodities: 
Their Compilation and Quality 


By Gregory Grossman 


The first of a series on Soviet economic growth, this book seeks to appraise
the quality of Soviet statistics of industrial output: how they are collected, 
processed, and published. Dr. Grossman analyzes the obstacles to an accurate 
and unbiased flow of information, given the plans, incentives, and penalties of 
the Soviet “command economy.” By highlighting the shortcomings of Soviet 
data, this study provides a basis for the evaluation of Soviet claims of indus- 
trial growth. Published for the National Bureau of Economic Research. 

172 pages. $4.50 


Statistical Measures of Corporate Bond Financing Since 1900 
By W. Braddock Hickman 


This third and last volume of the National Bureau of Economic Research
corporate bond research project contains basic statistics on which the analyses
of the second volume are founded, supplementary tables on topics not covered 
in that report, and a description of estimating procedures and notes on cover- 
age and suggested uses. Published for the National Bureau of Economic 
Research. 

616 pages. Tables. $9.00 


Order from your bookstore, or 
Princeton University Press, Princeton, N. J.
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Ready early January! 


ELEMENTS OF STATISTICAL INFERENCE 


BY ROBERT M. KOZELKA, Williams College 


This new textbook is designed for a one-semester course in statistical in- 
ference, following an introductory course in the calculus. Primarily in- 
tended for students of the social sciences, it may be used by anyone wishing 
a foundation upon which to base his later study of the techniques appro- 
priate to his particular field of interest. 


The book offers an elementary mathematical treatment of the classical ideas 
of estimation and hypothesis testing, introducing probability through set 
theory. The aim of the author has been to suggest to the student how the 
statistician thinks, why he thinks that way, and some of the things he is apt 
to think about. 


c. 160 pp, 47 illus, 1961—probably $5.00 


ADDISON-WESLEY PUBLISHING COMPANY, INC.


Reading, Massachusetts 


MODERN FACTOR ANALYSIS 


By Harry H. Harman 


A SOPHISTICATED, accurate, and up-to-date account of
factor analysis from its basic foundations through the
latest and most advanced techniques, including the use of
high-speed electronic computers.


Designed for use as both a text and a reference, this book will 
ably serve the needs of graduate students and researchers 
using factor analysis as a statistical tool. 

480 pages. 1960. $10.00 


Through your bookseller
UNIVERSITY OF CHICAGO PRESS 160 Avenue, 17,
In Canada: The University of Toronto Press, Toronto 5, Ontario
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THE COMPLETE INTRODUCTION TO 


Principles of STATISTICAL ANALYSIS 


Samuel B. Richmond, Columbia University 


A comprehensive textbook designed 
as an introduction to statistical analysis 
for students in business and economics. 
Detailed illustrative material combines 
with the text to present a thorough treat- 
ment of the collection, analysis, and pre- 
sentation of statistical data. Organized 
around the modern concept of statistical 


induction, book keeps mathematical pro- 
cedures to a minimum by introducing and 
explaining the techniques employed at the 
point of use. A Glossary of Equations lists 


each basic equation used in the text.


“Well organized ... excellent.”—ADVANCED MANAGEMENT. 1957. 491 pp.; 210
ills., tables. $6.50


For clearer visualization of facts and figures ... 


Handbook of GRAPHIC PRESENTATION 


Calvin F. Schmid, University of Washington 


A working guide for all concerned with 
producing meaningful, interesting charts 
and graphs. Handbook shows how compli- 
cated data can be put into easily intelligi- 
ble form; fully analyzes each basic type 
of chart, indicating the advantages and 
disadvantages in information 
of different kinds; gives helpful pointers 


on avoiding difficulties in construction. 
Includes a detailed discussion of three- 
dimensionals, plus scores of examples from 
a wide variety of fields. “A highly creditable
job of producing a reliable and helpful
handbook.”—JOURNAL OF THE AMERICAN
STATISTICAL ASSOCIATION, 1954. 316
pp.; 210 ills., tables. $6.50


THE RONALD PRESS COMPANY • 15 East 26 St., New York 10


Ready in January 
ELEMENTS OF MODERN STATISTICS 


for Students of Economics and Business 
By Boyd L. Nelson, University of Maryland 


This new text, designed for a one-semester course in elementary business and 
economic statistics, emphasizes statistical concepts and the pervasiveness of 
sampling and inductive inference. The aim of the book is to expose the student 
to the subject to such a degree that he can make statistics work for him in his 
field, and to familiarize him with the immense power and usefulness of sta- 
tistics as well as to understand its limitations. Concise and to the point, the 
clarity of writing breaks through the mathematical “language barrier” and 
makes the material readily understandable to students with a minimum of 
mathematical preparation. Sufficient material is included to lay a solid founda- 
tion for advanced work in the field of statistics. 


about 440 pages illustrated about $6.00 


APPLETON-CENTURY-CROFTS, INC. 
35 West 32nd Street New York 1, New York 
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ELEMENTARY STATISTICAL METHODS IN PSYCHOLOGY 
AND EDUCATION 
Paul Blommers and E. F. Lindquist 


A basic text which encourages the beginner in the uses and interpretation of statistics, 
stressing the importance of a critical evaluative attitude. 
528 pages 1960 $5.75 


Study Manual for Elementary Statistical Methods in Psychology $2.00 


STATISTICAL REASONING IN SOCIOLOGY 
John H. Mueller and Karl F. Schuessler 


Designed for the sociologist, the aim of this new text is not only to teach basic statistical 
techniques, but also to make plausible and comprehensible in humanistic terms the 
fundamentals of statistical reasoning. An Instructor's Key to the questions and problems 
will be available. 

An Early 1961 Publication 


HOUGHTON MIFFLIN COMPANY 


New York ATLANTA GENEVA 


MACMILLAN TEXTS IN STATISTICS 


STATISTICAL METHODS IN EXPERIMENTATION: 
An Introduction 
OLIVER L. LACEY, University of Alabama 


Presents elementary statistics in a meaningful, logical manner . . . emphasizes
discussion of underlying principles of statistics in almost any field of experimentation.


1953, 249 pages, $5.50 


STATISTICS IN EDUCATION 
MERLE W. TATE, University of Pennsylvania 
A general introduction to statistical methods as applied in educational measure- 
ment and research . . . encourages correct use and interpretation of descriptive 
statistics. 1955, 597 pages, $6.25 
EDUCATIONAL MEASUREMENT 
ROBERT M. W. TRAVERS, University of Utah 
Designed to develop student judgment of appropriate uses for measurement . . . 
stresses psychological and educational theory indispensable to application of tech- 
niques. 1955, 420 pages, $5.75 
ELEMENTARY STATISTICS FOR JOURNALISTS 
DAVID MANNING WHITE, University of Colorado and SEYMOUR LEVINE 
Furnishes basic statistical concepts in a frame of reference meaningful to the
journalist . . . work problems at chapter ends based on actual communications problems.
1954, 84 pages, paper, $1.75 


The Macmillan Company

60 FIFTH AVENUE, NEW YORK 11, N. Y.


Please mention the Journal of the American Statistical Association in writing advertisers


PALO ALTO


TECHNOMETRICS 
A Journal of Statistics for the Physical, Chemical and 


Engineering Sciences 


VOL. 2, No. 3, AUGUST, 1960


The Compound Hypergeometric Distribution and a System of Single
Sampling Inspection Plans Based on Prior Distributions and Costs .... A. Hald


Some Remarks on the Bayesian Solution of the Single Sampling Inspec- 
tion Scheme G. B. Wetherill 


Serial Sampling Acceptance Schemes Derived from Bayes’s Theorem D. R. Cox 


Discussion of the Papers of Messrs. Hald, Wetherill and Cox .... G. A. Barnard,
D. V. Lindley, B. Hill, F. J. Anscombe, I. J. Good, and G. Horsnell


Variations Flow Analysis Norbert L. Enrick 
A Semigraphical Method for the Analysis of Complex Problems E. Anderson 
Inter-plant Storage in Continuous Manufacturing H. D. Miller 


Estimation of the Parameters of Two Parameter Exponential Distributions
from Censored Samples .... B. Epstein


VOL. 2, No. 4, NOVEMBER, 1960 


Statistical Life Test Acceptance Procedures .... B. Epstein
Estimation from Life Test Data .... B. Epstein


Some New Three Level Designs for the Study of Quantitative Variables 
G. E. P. Box and D. W. Behnken 


Graphical Procedure for Fitting the Best Line to a Set of Points .... J. L. Dolby


Tables of Tolerance-Limit Factors for Normal Distributions 
Alfred Weissberg and Glenn H. Beatty 


On the Evaluation of the Negative Binomial Distribution with Examples 
G. P. Patil 


On Methods of Constructing Sets of Mutually Orthogonal Latin Squares 
Using a Computer .... R. C. Bose, I. M. Chakravarti and D. E. Knuth


Technometrics is published quarterly in February, May, August and November. To 
members of the American Statistical Association and the American Society for Quality 
Control the annual subscription rate is $6.00. The annual non-member subscription rate 
is $8.00. Checks should be made payable to Technometrics and addressed to
Technometrics, Post Office Box 587, Benjamin Franklin Station, Washington 6, D.C.




The Annals of Mathematical Statistics 


THE OFFICIAL JOURNAL OF THE INSTITUTE OF 
MATHEMATICAL STATISTICS 


December 1960 Vol. 31, No. 4 


Contents 


Simplex-Sum Designs: A Class of Second Order Rotatable Designs Derivable from
  Those of First Order .... G. E. P. Box and D. W. Behnken
Third Order Rotatable Designs in Three Dimensions .... Norman R. Draper
A Third Order Rotatable Design in Four Dimensions .... Norman R. Draper
Some Aspects of Weighing Designs .... Damaraju Raghavarao
Random Allocation Designs I: On General Classes of Estimation Methods .... A. P. Dempster
A Mixed Model for the Complete Three-Way Layout with Two Random-Effects
  Factors .... J. P. Imhof
Tests for Regression Coefficients when Errors are Correlated .... M. M. Siddiqui
Mixed Model Variance Analysis with Normal Error and Possibly Non-Normal
  Other Random Effects, Part I: The Univariate Case .... S. N. Roy and Whitfield Cobb
Mixed Model Variance Analysis with Normal Error and Possibly Non-Normal
  Other Random Effects, Part II: The Multivariate Case .... S. N. Roy and Whitfield Cobb
An Associated Polynomial Least Square Richard Warren
Asymptotic Rate of Discrimination for Markov Processes .... Lambert H. Koopmans
Equalities for Stationary Processes Similar to an Equality of Wald .... Shu-Teh Chen Moy
Multivariate Chebyshev Inequalities .... Albert W. Marshall and Ingram Olkin
On Deviations of the Sample Mean .... R. R. Bahadur and R. Ranga Rao
On the Independence of a Sample Central Moment and the Sample Mean
  .... R. G. Laha, E. Lukacs and M. Newman
Order Statistics of Partial Sums .... J. G. Wendel
Probability Distributions Related to Random Mappings .... Bernard Harris
Some Asymptotic Results for a Coverage Problem .... Max Halperin
are Componen
Morrison H. A. David
Maximizing the Probability that Adjacent Order Statistics of Samples from Several
  Populations form Overlapping Intervals
  .... Richard Cohn, Frederick Mosteller, John W. Pratt, and Maurice Tatsuoka
Normalizing the Noncentral t and F Distributions .... Nico F. Laubscher
Probability Content of Regions Under Spherical Normal Distributions, II: The
  Distribution of the Mean in Normal Samples .... Harold Ruben
Tables of Range and Studentized Range .... Leon Harter
Asymptotic Formulae for the Distribution of Hotelling’s Statistic,
On Unbiased Estimation .... Schmetterer
Two-Stage Experiments for Estimating a Common Mean .... Donald Richter
Rank-Sum Tests for Dispersions .... A. R. Ansari and R. A. Bradley
A Relationship between Hodges’ Bivariate Sign Test and a Non-Parametric Test of
  Daniels .... Bruce M. Hill
Minimax Sequential Tests of Some Composite Hypotheses .... DeGroot
Inverse Sampling Plans When an Exponential Distribution is Sampled with
  Censoring .... Jack Nadler


NOTES


A Conservative Property of Binomial Tests .... H. A. David
An Optimum Property of Regular Maximum Likelihood Estimation .... V. P. Godambe


Address orders for subscriptions and back numbers to Professor George E. Nichol- 
son, Jr., Secretary, Institute of Mathematical Statistics, Department of Statistics, 
University of North Carolina, Chapel Hill, North Carolina. 




THE JOURNAL OF FINANCE 


The Journal of THE AMERICAN FINANCE ASSOCIATION 


Vol. XV September 1960 No. 3 


Organized Securities Exchanges in Canada James E. Walter and J. Peter Williamson 
Some Loanable Funds Concepts and Banking Theory .... E. R. Wicker
The Role of Equipment Obligations in Post War Railroad Financing .... Donald M. Street


The Trends» Security Portfolios Financial Corporations, investment. Donald 


Pricing A Banking Service—The Special Checking Account .... Martin H. Seiden
The Earnings Performance of the Consumer Finance Industry .... Sidney Cottle
Abstracts of Doctoral Dissertations

Activities of Student Affiliates 

Book Reviews 

Books Received 


_ H. Administration, New York 
University, 100 Trinity Place, New York 6, New York. 


Communication relating to the contents of The Journal of Finance should be addressed to the Editor, 
Joel Segall, Graduate School of Business, University of Chicago, Chicago 37, Illinois, or to the
Associate Editor, John G. Gurley, Brookings Institution, 722 Jackson Place, N.W., Washington 6,


AMERICAN ECONOMIC REVIEW 


VOLUME L SEPTEMBER 1960 NUMBER 4


ARTICLES 
First Two Decades: American Economic Association .... A. W. Coats


Patterns of Industrial Growth ...... 
Credit Controls and Financial Intermediaries .... D. A. Alhadeff


REVIEW ARTICLE 


COMMUNICATIONS 
W. G. Bowen, R. G. Davis, and D. H. Kopf 
Capital Formation in Underdeveloped Countries .... Nathan Rosenberg
Wages and Interest—A Modern Dissection of 
Marxian Economic Models: Comment F. M. Gottheil 
The Pure Theory Trade: H. G. Johnson 


The AMERICAN ECONOMIC REVIEW, a quarterly, is the official publication of the 
American Economic Association and is sent to all members. The annual dues are six dollars. 
Address editorial communications to Dr. Bernard F. Haley, Editor, AMERICAN ECONOMIC
REVIEW, Stanford University, Stanford, California. For information concerning other
publications and activities of the Association, communicate with the Secretary-Treasurer, Dr.
James Washington Bell, American Economic Association, Northwestern University, Evanston,
Illinois. Send for information booklet.




dues, including $3.00 allocated to subscription in The Journal of Finance, are $5.00
annually. Student subscription is $2.00 a year. Libraries may subscribe to The Journal at $5.00
and be for for membership in the Ameri- 


POPULATION STUDIES 


A Journal of Demography 
Edited by D. V. GLASS and E. GREBENIK


Vol. XIV No. 2 CONTENTS November 1960 


T. E. SMITH. The Cocos-Keeling Islands: A Demographic Laboratory. 
FLANN CAMPBELL. Birth Control and the Christian Churches. 
W. BRASS. The Graduation of Fertility Distributions by Polynomial Functions 


JUDAH MATRAS. Comparison of Intergenerational Occupational Mobility Pat- 
terns: An Application of the Formal Theory of Social Mobility. 


Book Reviews. 


Subscription price per volume of 2 parts 42s. net, post free 
(or American currency $6.75). 
Single parts £1 each plus postage (American $3.25, post free). 
Published by the POPULATION INVESTIGATION COMMITTEE, at the LONDON SCHOOL 
OF ECONOMICS and POLITICAL SCIENCE, 15 HOUGHTON STREET, LONDON, W.C.2. 


JOURNAL OF BUSINESS 


Graduate School of Business, University of Chicago, Chicago 37, Illinois 


Volume XXXIII OCTOBER 1960 Number 4 


The Carnegie Tech Management Game .... K. J. Cohen and others
The New Competition—International Markets: How Should We Adapt? .... Yale Brozen
The MAPI Urgency Rating as an Investment Ranking Criterion 

Price Policy and Discounts in the Medium- and High-priced Car Market .... Allen Jung
The Role of Law in Education for Business Jacob Weissman 
Reduced Costs Through Job Enlargement: A Case Maurice Kilbridge 
Economic Analysis in Norwegian Collective Bargaining ............. G. M. Donhowe 
An Experimental Method for Estimating Demand Edgar A. Pessemier 


The JOURNAL OF BUSINESS is published quarterly by the University of Chicago Press. Subscriptions
are $6.00 per year and should be addressed to THE JOURNAL OF BUSINESS, Graduate School
of Business, University of Chicago. Manuscripts in duplicate, typed and double-spaced (including
footnotes and quotations), and editorial correspondence should be addressed to Irving Schweiger,
Editor, JOURNAL OF BUSINESS, at the same address.




CHAPTER PRESIDENTS


ALBANY
ARIZONA
AUSTIN
BOSTON
BUFFALO-NIAGARA
CENTRAL INDIANA
CENTRAL IOWA
CENTRAL NEW JERSEY
CHICAGO
CINCINNATI
CLEVELAND
COLORADO-WYOMING
COLUMBUS
CONNECTICUT
DAYTON
DETROIT
HARRISBURG, PA.
HAWAII
ILLINOIS
ITHACA
MILWAUKEE
MONTREAL
NEBRASKA
NEW ORLEANS
NEW YORK
NORTH CAROLINA
NORTH TEXAS
PHILADELPHIA
PITTSBURGH
PUERTO RICO
ROCHESTER, N. Y.
SACRAMENTO
SAN FRANCISCO
CALIFORNIA Charles I. Landenberger
STATE PA.
ST. LOUIS
TULSA


Helen C. Chase, Principal Biostatistician, New York State Dept. of
Health, 84 Holland Ave., Albany 8, New York

David Cooper ny ke Arizona Public Service Co., 501 South Third
Ave., Phoenix, Arizona

Woerner, Dept. Public Safety, North Austin Station,
Austin,

David Durand, 52-480, Massachusetts Inst. of Technology, Cam- 
bridge 39, Massachusetts 

Alfred Blumstein, Cornell Aeronautical Laboratory, 4455 Genesee 
Street, Buffalo 21, New York 

James A. Norton, Jr., Statistical Laboratory, Purdue University,
Lafayette, Indiana 

Harvey N. Albond, City Plan and Zoning Comm., Des Moines, Iowa 

Mrs. Gladys W. Ellsworth, Research & Statistics, N. J. State Dept. 
Cons. & Economic Development, 520 E. State St., Trenton 7, N. J. 

Robert L. Seidner, Chicago’s American, 326 W. Madison Street,
Chicago 6, Illinois

Carl Smith, Institute of Medical Res., Christ Hospital, Auburn
Avenue, Cincinnati,

Charles H. Joseph, Jr., The World Publishing Co., 2231 W. 110th
Street, Cleveland 2, Ohio

Pearl A. Van Natta, Child Research Council, School of Med., University
of Colorado, 220 E. 9th Avenue, Denver, Colorado

Roy L. i ams, 274 West Kenworth Road, Columbus 14, Ohio

Seal, D of Osborn Zoological Laboratory,
New Haven,

J. Stansbrey, 314 Wisteria 19, Ohio

Harry oe leony Research Center, Univ. of Mich., Ann Arbor

Dewey O ster, Div. of Crop Reporting, Dept. of Agriculture,
25, South Office Bldg., Harrisburg, Pennsylvania


Paul T. Tajima, Real Property Valuation Engineer, Office of Terri- 
torial Tax Comm., Honolulu 13, Hawaii 

Thomas A. Yancey, Dept. of Economics, University of Illinois, 
- S. Wright Street, Urbana, Illinois 

C. Henderson, Department of Husbandry, Cornell University, 


New York 
Edward F. Hornick, Wisconsin Telephone Company, 740 North 
Broadway, Rm. $20B, Milwaukee 2, Wisconsin 
Roger Lessard, Ecole Polytechnique, , 1430 St. Denis, Montreal, Que. 
Edgar Z. Palmer, Chairman, Dept. of Business Research. "Social 
Science Bldg. 310A, University of Nebraska, Lincoln 8, Nebraska 
Roland Pertuit, 4871 Metropolitan Drive, New Orleans, "Louisiana 
Robert E. Lewis, Economics Department, First National City 
Bank of N. Y., New York 15, N. Y. 
Bernard G. Greenberg, Dept. of Biostatistics, School of Public 
Health, University of North Carolina, Chapel Hill, North Carolina 
Leroy Folks, 713 § herwood, Richardson, T 
Benjamin J. Tepping National y on Ey Inc., 1015 Chestnut 
St., Philadelphia 7, Pennsylvania 
Philip Hermann, Jones & Laughlin Steel Corp., $3 Gateway Center, 
Pittsburgh, Pennsylvania 
Alvin Mayne, 2169 Calle Gen Del Valle, Santurce, Puerto Rico 
Donald A. right, Paper Service Div. b-57, Eastman Kodak Co. a 
Kodak Park Works, Rochester, N. Y. 
Robert H. Gustafson, Division of Research & Statistics, California 
Bad. ization, 1020 N Street, Sacramento, California 
. of Economics, San Francisco State ollege, 
Pacific T 740 8. Olive 
‘act. lompany, 
St., Room 1156, oy s Angeles 55, California 
Robert W. Kautz, School of Business, Pennsylvania State Univer- 
sity, University Park, Pennsylvania 
George = Little, Southwestern Bell Telephone Co., 1010 Pine 
Lyndral Hospital & Ph 
yndr arcum, Blue Cross ysician 
Service, P.O. Boz 1738, Tulsa, Oklahoma 


Twin Crtres (Mtnn.) Harold J. Guide, 5917 Chowen Avenue South, Minneapolis 10 


VIRGINIA 


James Armstrong, Jr., E. I. du Pont de Nemours & Co. Inc., P.O. 
Boz 1477, Richmond 12, Virginia 


Wasuineron, D. C. Wolfbein, Deputy Asst. Sec’y for Manpower, U. S. 


of Labor, Washington 265, D. C. 


ALBANY: Elizabeth H. Research A MN. Y. State Dept. snd Mp, flor Big Albany, N. Y. 
BOSTON: E. Alman, Office of Statistical & Res. Service, Boston University, 786 Commonwealth Ave., Boston 16, Mass. 
Norman C. Severo, Associate Prof. of Mathematical Stat., Dept. 
CENTRAL IOWA: of Industrial Engineering, Iowa State University, Ames, Iowa 
James Tumbusch, General Electric Company, Testing Operation, F.P.L.D., Building 000, Cincinnati 15, Ohio 
Morris Dainoveky, Market Research Services, 665 
COLORADO-WYOMING: Donald N. Livingston, 1644 South Ivy Way, Denver 28, Colo. 
DAYTON: Mrs. Witherow Kurts, 9664 North Gettysburg Avenue, A PA, urn 
Henry F. Kaiser, Bureau of Educational Research, University of 
ITHACA: Philip J. McCarthy, New York State School of Industrial and Labor Relations, Cornell University, Ithaca, New York 
Joseph W. McGee, Dept. Sociology, Marquette 
MONTREAL: James A. Coombs, The Bell Telephone Co. of Canada, Montreal, 
NEW YORK: Usien H Canter, $76 7th New York, N. Y. 
NORTH CAROLINA: Roy Kuebler, Jr., Dept. Biostatistics, of Neh 
NORTH TEXAS: son, Computing Laboratory, Southern Methodist University, Dallas #8, Texas 
PHILADELPHIA: Frederick N. Sass, 6418 Lawnton Avenue, Philadelphia, Pa. 
Pind, Di Research & Statistics, Dept. of Employment, 
SAN FRANCISCO: Roderick, California Economic Development Agency, 281 World Trade Center, San Francisco 11, rg com 
STATE COLLEGE, PA.: George G. Burgess, BD Bos 48-8, State 
ST. LOUIS: A. J. Meigs, Federal Reserve Bank of P. O. Box 448, 
Manian University of Tulsa, Oklahoma 
WASHINGTON, D. C.: Edwin Cold, Statistical Reports Division, 


New Books from McGraw-Hill... 

ELEMENTS OF THE THEORY OF MARKOV 
PROCESSES AND THEIR APPLICATIONS 

By A. T. Bharucha-Reid, University of Oregon. McGraw-Hill Series in 
Probability and Statistics. 469 pages. $11.50 


A graduate-level text and reference in advanced statistics with numerous applications 
to several fields of science. The author presents an introduction to the theory 
of Markov processes, and also gives a formal treatment of mathematical models 
based on this theory. The emphasis is on application. 


AN INTRODUCTION TO LINEAR 
STATISTICAL MODELS, Vol. I 

By Franklin A. Graybill, Oklahoma State University. The McGraw-Hill 
Series in Probability and Statistics. Ready in January, 1961. 


This excellent text has been written to fulfill two needs: (1) for a theory textbook 
for seniors and first year graduate students in statistics, and (2) for a reference 
book in the area of regression, correlation, least squares, experimental design, etc., 
semester course. 


AN INTRODUCTION TO MATHEMATICS 
FOR BUSINESS ANALYSIS 

By Robert C. Meier, General Mills, Inc., and Stephen H. Archer, University 
of Washington. 284 pages, $6.95. 

For businessmen and students of business and economics interested in learning about 
the uses of mathematical and statistical techniques in the solution of business problems, 
but who lack formal training in these subjects. 


INTRODUCTION TO PROBABILITY 
AND RANDOM VARIABLES 

By George P. Wadsworth, Massachusetts Institute of Technology; and Joseph 
G. Bryan, American Machine and Foundry Company, Stamford, Connecticut. 
McGraw-Hill Series in Probability and Statistics. 304 pages. $8.75 


A text for an undergraduate course in the mathematical theory and practical applications 
of probability. It is also suitable for self-study by practicing engineers and research 
workers. Elementary calculus is a prerequisite. Auxiliary mathematical material 
is supplied as needed, but highly compact terminology or specialized symbols are 
avoided to keep the book well within the reach of 


calculus. 
Send for copies on approval 


McGraw-Hill Book Company, Inc. 


330 West 42nd Street New York 36, New York 




